Embracing .grib Files and Freeing Ourselves from Weather APIs

Crafting a Weather Forecast App with the MERN Stack (MongoDB, Express, React, Node.js) ... Part 1

Do not rely on a weather API

As a windsurfer, I heavily rely on weather forecast apps, and I thought it might be fun to build one myself. To make it more interesting, I decided not to use any of the many weather APIs.

I like my data RAW

I searched for ways to obtain the raw data or, at the very least, weather maps or arrays. During my search, I stumbled upon the website of the DWD (German weather service) and discovered that they provide .grib files containing forecast values. Since I am based in Germany, I wanted to use data that covers this area. Other countries have similar websites that cover their respective regions.

How to obtain these files

The required files are downloadable via FTP. In my region, the ICON forecast model is highly accurate, so I chose to use it. The files are available at opendata.dwd.de. Every 3 hours, the forecast is updated, and new .grib files become available.

Here's the process of getting the .grib files:

  1. Check if new .grib files are available

  2. Download the .grib files

  3. Extract the .bz2 archives to get the .grib files

Now, let's start coding!

First, we create a new folder for the project and change the directory to the new folder. Then we initialize a new npm package.

mkdir windspotter
cd windspotter
npm init -y

Now, we create a src folder in which we'll place our index.js.

mkdir src
touch src/index.js

The next step is to edit our package.json. We want to point our start script to the recently created index.js. Later, we can run our application with npm start.

// package.json
...
"scripts": {
    "start": "node ./src/index.js"
},
...

Let's install some npm packages we need to get the .grib files from the FTP server. We'll use Basic-FTP for downloading the files and Decompress and Decompress-bzip2 for extracting the .bz2 files. While other packages are available, I chose Basic-FTP for its simplicity and Decompress because I don't want to deal with streams.

npm install basic-ftp decompress decompress-bzip2

Effortless Organization for Smooth Sailing!

To stay organized, we will create a few more folders and files. I prefer to have different tasks handled in separate modules. While it's possible to code everything in one file, I like to tackle the problems one step at a time.

mkdir src/ftp src/config grib_data
touch src/ftp/index.js src/config/index.js

Easy Maintenance

We create a config file so that we have all the values that might change over time in one place.

dataValues is an array with the measured values that we need for our forecast.

The measured values can be displayed on different grids. We use the regular-lat-lon model where the grid follows the geographic coordinate system. This makes it easy for us to search the measured values for specific coordinates.

The vertical level is described by fCHeight, which selects one of the model's 65 layers. Layer 1 is the highest, and layer 65 is closest to the ground. We want the values determined close to the ground, which is why we use '_65_' here. In other forecast models, the number of layers may vary.

// src/config/index.js
const dataValues = ['t', 'v', 'u'];
const fCModel = 'regular-lat-lon_model';
const fCHeight = '_65_';

module.exports = {
  dataValues,
  fCModel,
  fCHeight,
};
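To get a feel for how these config values will be used, here's a quick sketch of the name filter we'll build later. The file names below are made up for illustration (the real names on the server differ in detail), but the filtering logic is the same:

```javascript
// Values copied from src/config/index.js
const fCModel = 'regular-lat-lon_model';
const fCHeight = '_65_';

// Made-up file names, loosely resembling what the server lists
const names = [
  'icon-d2_germany_regular-lat-lon_model-level_2023062712_000_65_u.grib2.bz2',
  'icon-d2_germany_icosahedral_model-level_2023062712_000_65_u.grib2.bz2',
  'icon-d2_germany_regular-lat-lon_model-level_2023062712_000_30_u.grib2.bz2',
];

// Keep only files on the regular lat-lon grid at layer 65
const wanted = names.filter(
  (name) => name.includes(fCModel) && name.includes(fCHeight),
);
// wanted now contains only the first name
```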

List, Download, Extract, Repeat

We create a new asynchronous function with the name downloadFiles. Later, we will pass the latest update time (databaseTimestamp) as an argument. The fallback value for databaseTimestamp is a timestamp with the date 0 (January 01, 1970).

Try and Hopefully Never Catch

We initiate the FTP connection in a try-catch block. This way, if anything goes wrong, we can catch the error and continue to run the code.

After connecting to the server with client.access, we obtain a list of folders for our given path. The names of these folders represent the update times of the forecast model.

Pick the right time

Now, we need to determine which folder is the most current one at runtime. For now, we will pass the folderList to a function. Later, we will create that function. Let's assume that we get back the name of the folder with the latest forecast.

We change the directory to the one with the latest forecast and retrieve a list of files from the folder. Once again, we will pass that list to a function. This function will give us back the update time of those files.

Check If the Database Is Up to Date

In the last step, we will check if the update time of the forecast is newer than our database update time. If the files on the server are more recent, we will start to download those files.
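The comparison described above boils down to a small predicate. This is only an illustrative sketch with a function name of my choosing, not code from the final module:

```javascript
// Skip the download when the server files are not newer than the database,
// or when they changed less than 5 minutes ago (the upload may still be running).
const shouldSkipDownload = (serverTimestamp, databaseTimestamp, now = new Date()) =>
  serverTimestamp < databaseTimestamp ||
  now - serverTimestamp < 5 * 60 * 1000;

const now = new Date('2023-06-27T12:00:00Z');

// Database already newer than the server files -> skip
shouldSkipDownload(new Date('2023-06-27T11:00:00Z'), new Date('2023-06-27T11:30:00Z'), now); // true
// Server files newer than the database and older than 5 minutes -> download
shouldSkipDownload(new Date('2023-06-27T11:00:00Z'), new Date('2023-06-27T09:00:00Z'), now); // false
// Server files only 2 minutes old -> skip for now
shouldSkipDownload(new Date('2023-06-27T11:58:00Z'), new Date('2023-06-27T09:00:00Z'), now); // true
```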

Look What We Have So Far.

// src/ftp/index.js
const ftp = require('basic-ftp');
const { dataValues, fCModel, fCHeight } = require('../config');

...

const downloadFiles = async (databaseTimestamp = new Date(0)) => {
  const server = 'opendata.dwd.de';
  const dict = 'weather/nwp/icon-d2/grib';
  const client = new ftp.Client();

  try {
    await client.access({
      host: server,
    });
    // get a list of folders from the given ftp path
    const dirList = await client.list(dict);
    const forecastTimes = dirList.map((folderInfo) => folderInfo.name);
    // get the latest forecast folder name
    const nextForecastTime = getNextForecastTime(forecastTimes);
    await client.cd(`${dict}/${nextForecastTime}`);
    const fileList = await client.list();
    // get the last update time from the requested files
    const serverTimestamp = getServerTimestamp(fileList);
    // check if the files are older than the data in our database
    if (
      serverTimestamp < databaseTimestamp ||
      new Date() - serverTimestamp < 5 * 60 * 1000
    ) {
      console.log('database is up to date');
      client.close();
      return null;
    }
    // create a list of the files and download them

    ...

    client.close();
    return nextForecastTime; 
  } catch (err) {
    console.log(err);
    return false;
  }
};

What Happens in the Blackbox?

Let's write our getNextForecastTime() function. Its purpose is to find the most recent update hour that has already passed.

First, we need to convert our array of strings to an array of numbers. Next, we get the current hour at the time the code is executed. We work in UTC because the DWD forecast times are also given in UTC.

The next step is to filter the array, removing all values that are higher than the current hour. Using Math.max(), we get the highest value of the resulting array.

We convert that number back to a two-digit string with a leading zero and return it.

// src/ftp/index.js
...
const getNextForecastTime = (forecastTimes) => {
  // convert Strings into Numbers
  const forecastTimesNumbers = forecastTimes.map((hour) => parseInt(hour, 10));
  // get the hour of the current time
  const hourNow = new Date().getUTCHours();
  // get the latest forecast time relative to the current hour
  const nextForecastTime = Math.max(
    ...forecastTimesNumbers.filter((hour) => hour <= hourNow),
  );
  // return the number as string with leading zeros
  return String(nextForecastTime).padStart(2, '0');
};
...
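For illustration, here is a variant of the same logic with the current hour injected as a parameter, so it can be exercised without mocking the clock. The fallback to the latest listed run is my addition; it is not strictly needed for ICON-D2, whose run list always contains 00:

```javascript
// Testable variant of getNextForecastTime with the current hour injected.
// If no run of the current day has passed yet, fall back to the latest
// listed run (i.e. yesterday's last run).
const pickForecastTime = (forecastTimes, hourNow) => {
  const hours = forecastTimes.map((hour) => parseInt(hour, 10));
  const passed = hours.filter((hour) => hour <= hourNow);
  const pick = passed.length ? Math.max(...passed) : Math.max(...hours);
  return String(pick).padStart(2, '0');
};

const runs = ['00', '03', '06', '09', '12', '15', '18', '21'];
pickForecastTime(runs, 14); // '12' - the 12 UTC run is the latest one passed
pickForecastTime(runs, 0);  // '00'
pickForecastTime(['03', '06'], 1); // '06' - yesterday's last run
```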

Are We Up to Date?

The getFileTimestamps() function returns the times at which the files on the server were last modified.

We have to convert the rawModifiedAt string of each file to a timestamp. The catch is that this string contains no year. To fill the gap, we take the current year from new Date().

However, taking the current year creates a corner case. On January 01 of each year, we may want files from December 31 of the previous year; stamping them with the current year would produce a timestamp in the future. To prevent this, we check if the server timestamp is newer than the current date, and if so, subtract one year from it.

After mapping each file to its timestamp, we return the new array.

// src/ftp/index.js
...
const getFileTimestamps = (files) => {
  const dateNow = new Date();
  return files.map((file) => {
    // split the date string and create a timestamp from it
    const modDateArr = file.rawModifiedAt.split(' ');
    const timestamp = new Date(
      `${modDateArr[0]} ${modDateArr[1]}, ${dateNow.getFullYear()} ${
        modDateArr[2]
      }+00:00`,
    );
    // Jan 01 corner case
    if (timestamp > dateNow) {
      timestamp.setFullYear(timestamp.getFullYear() - 1);
    }
    return timestamp;
  });
};
...
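To make the corner-case handling concrete, here is the same parsing logic as a standalone function with "now" injected, so it can be checked deterministically. The rawModifiedAt format shown is an assumption based on a typical Unix-style FTP listing (month, day, time, no year):

```javascript
// Standalone version of the timestamp parsing with "now" injected for testing.
const parseRawModifiedAt = (raw, now) => {
  const [month, day, time] = raw.split(' ');
  const timestamp = new Date(`${month} ${day}, ${now.getUTCFullYear()} ${time}+00:00`);
  // Jan 01 corner case: a date in the future must belong to the previous year
  if (timestamp > now) {
    timestamp.setFullYear(timestamp.getFullYear() - 1);
  }
  return timestamp;
};

const now = new Date('2024-01-01T01:00:00Z');
// "Dec 31 23:00" stamped with the current year 2024 would lie in the future,
// so the corner case kicks in and we end up in 2023:
parseRawModifiedAt('Dec 31 23:00', now); // 2023-12-31T23:00Z
```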

Finally, Our Loop:

Now that we have created all the necessary functions and variables, we can start to download the files. We loop over our array of dataValues and list the files in the matching directory, then narrow the list down to the required files by their names.

Finally, we loop over these filenames and download the files one by one. Right after the download of a file finishes, we decompress it before moving on to the next. The decompression is done by the decompressFile() function: we pass in the file name and its directory, and the function does its magic.

// src/ftp/index.js
...
// create a list of the files and download them
    for (const value of dataValues) {
      let clientList = await client.list(`./${value}`);
      // filter out the unwanted files
      clientList = clientList
        .map((file) => file.name)
        .filter((name) => name.includes(fCModel) && name.includes(fCHeight));
      // download the files one by one
      for (const file of clientList) {
        await client.downloadTo(`./grib_data/${file}`, `./${value}/${file}`);
        await decompressFile(file, './grib_data/');
      }
    }
...

Decompress the .bz2

In the asynchronous function decompressFile(), the filenames and the path will be concatenated and passed into the decompress method from the decompress module. As a plugin, we specify the decompressBzip2 module.

After decompressing, we delete the source .bz2 file, and we are done with the magic.

// src/ftp/index.js
...
const fs = require('fs');
const decompress = require('decompress');
const decompressBzip2 = require('decompress-bzip2');
...
const decompressFile = async (file, path) => {
  const regex = /.*(?=\.bz2)/;
  await decompress(`${path}/${file}`, './', {
    plugins: [
      decompressBzip2({
        path: `${path}/${file.match(regex)[0]}`,
      }),
    ],
  });
  fs.unlinkSync(`${path}/${file}`);
  fs.chmodSync(`${path}/${file.match(regex)[0]}`, 0o755);
};
...
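A quick illustration of what the regex does (the file name is made up):

```javascript
// The lookahead matches everything before the ".bz2" suffix,
// so we can derive the extracted file's name from the archive name.
const regex = /.*(?=\.bz2)/;
const archive = 'icon-d2_made-up-example.grib2.bz2';
const extracted = archive.match(regex)[0]; // 'icon-d2_made-up-example.grib2'
```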

Conclusion

In this blog post, I shared the first part of my journey in building a weather forecast app, without relying on weather APIs. Instead, I utilized free .grib files from the DWD (German weather service) through FTP.

By leveraging the raw data in .grib files, I started building the weather forecasting app. With the help of npm packages like Basic-FTP and Decompress, I managed to download and decompress the desired files.

At this stage, the weather app has completed the crucial step of downloading the forecast files. However, we are only getting started. In the upcoming parts of this series, we will delve into processing the downloaded .grib files, converting them to JSON, and storing the essential forecast information in MongoDB. With each of these developments, we come a step closer to achieving a fully functional weather forecast app.

Stay tuned for the next part of the series as we continue to build our weather forecast app! Your comments and feedback are most welcome as I embark on this exciting endeavour to learn and grow.

THE CODE:

// src/ftp/index.js
const fs = require('fs');
const ftp = require('basic-ftp');
const decompress = require('decompress');
const decompressBzip2 = require('decompress-bzip2');

const { dataValues, fCModel, fCHeight } = require('../config');

const getNextForecastTime = (forecastTimes) => {
  // convert Strings into Numbers
  const forecastTimesNumbers = forecastTimes.map((hour) => parseInt(hour, 10));
  // get the hour of the current time
  const hourNow = new Date().getUTCHours();
  // get the latest forecast time relative to the current hour
  const nextForecastTime = Math.max(
    ...forecastTimesNumbers.filter((hour) => hour <= hourNow),
  );
  // return the number as string with leading zeros
  return String(nextForecastTime).padStart(2, '0');
};

const getFileTimestamps = (files) => {
  const dateNow = new Date();
  return files.map((file) => {
    // split the date string and create a timestamp from it
    const modDateArr = file.rawModifiedAt.split(' ');
    const timestamp = new Date(
      `${modDateArr[0]} ${modDateArr[1]}, ${dateNow.getFullYear()} ${
        modDateArr[2]
      }+00:00`,
    );
    // Jan 01 corner case
    if (timestamp > dateNow) {
      timestamp.setFullYear(timestamp.getFullYear() - 1);
    }
    return timestamp;
  });
};

const decompressFile = async (file, path) => {
  const regex = /.*(?=\.bz2)/;
  await decompress(`${path}/${file}`, './', {
    plugins: [
      decompressBzip2({
        path: `${path}/${file.match(regex)[0]}`,
      }),
    ],
  });
  fs.unlinkSync(`${path}/${file}`);
  fs.chmodSync(`${path}/${file.match(regex)[0]}`, 0o755);
};

const getServerTimestamp = (fileList) => {
  // reduce array to only get the required values
  const sortedFiles = fileList.filter((file) => dataValues.includes(file.name));
  const fileTimestamps = getFileTimestamps(sortedFiles);
  // return latest Timestamp from folder
  return new Date(Math.max(...fileTimestamps));
};

const downloadFiles = async (databaseTimestamp = new Date(0)) => {
  const server = 'opendata.dwd.de';
  const dict = 'weather/nwp/icon-d2/grib';
  const client = new ftp.Client();

  try {
    await client.access({
      host: server,
    });
    // get a list of folders from the given ftp path
    const dirList = await client.list(dict);
    // convert list of folders to list of folderNames (forecastTimes)
    const forecastTimes = dirList.map((folderInfo) => folderInfo.name);
    // get the latest forecast folder name
    const nextForecastTime = getNextForecastTime(forecastTimes);
    await client.cd(`${dict}/${nextForecastTime}`);
    const fileList = await client.list();
    // get the last update time from the requested files
    const serverTimestamp = getServerTimestamp(fileList);
    // check if the files are older than the data in our database
    if (
      serverTimestamp < databaseTimestamp ||
      new Date() - serverTimestamp < 5 * 60 * 1000
    ) {
      console.log('database is up to date');
      client.close();
      return null;
    }

    // create a list of the files and download them
    for (const value of dataValues) {
      let clientList = await client.list(`./${value}`);
      // filter out the unwanted files
      clientList = clientList
        .map((file) => file.name)
        .filter((name) => name.includes(fCModel) && name.includes(fCHeight));
      // download the files one by one
      for (const file of clientList) {
        await client.downloadTo(`./grib_data/${file}`, `./${value}/${file}`);
        await decompressFile(file, './grib_data/');
      }
    }
    console.log('download complete');
    client.close();
    return nextForecastTime;
  } catch (err) {
    console.log(err);
    return false;
  }
};

module.exports = {
  downloadFiles,
};
// src/index.js
const { downloadFiles } = require('./ftp');

downloadFiles();
