
Logflare to NGINX Access Log (NCSA Extended) Format Using Node.js

Logflare is Awesome.

If you use Cloudflare for your site and don't pay for the Enterprise plan, server logs can become a problem when you rely on edge caching: requests served from the edge cache never reach your origin, so they never show up in your server logs.

Enter Logflare. This awesome service comes with an easily installable Cloudflare app and streams access data to a BigQuery backend.

The guides on the Logflare site do a far better job than I could of explaining how to set it up, so I recommend reading those.

Server Logs are Awesome Too, Though

You might already have an in-house solution, or use something like the brilliant Screaming Frog Log File Analyser. All that lovely data in BigQuery is great, but how do you use it without retooling?

That was the problem I faced for a project, so I quickly hacked up this Node.js script that writes the data in NCSA Extended format, as used by nginx. By default Logflare doesn't log content length, and it's not something I've needed, so the script simply writes 0 for the content length on every line.
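To show the target format, here's a minimal sketch of building one NCSA Extended ("combined") log line. The function and field names below are illustrative, not Logflare's actual schema; they just mirror the fields the full script extracts.

```javascript
// Sketch: assemble one NCSA Extended ("combined") log line.
// NCSA Extended: ip - - [time] "METHOD /path PROTO" status bytes "referer" "user-agent"
// Content length isn't logged by Logflare, so it is hardcoded to 0 here.
function toNcsaLine({ ip, timestamp, method, path, protocol, status, referer, userAgent }) {
  const ref = referer ? `"${referer}"` : '"-"';
  return `${ip} - - [${timestamp} +0000] "${method} ${path} ${protocol}" ${status} 0 ${ref} "${userAgent}"`;
}

console.log(toNcsaLine({
  ip: '203.0.113.7',
  timestamp: '10/Oct/2023:13:55:36',
  method: 'GET',
  path: '/index.html',
  protocol: 'HTTP/1.1',
  status: 200,
  referer: 'https://example.com/',
  userAgent: 'Mozilla/5.0',
}));
// → 203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 0 "https://example.com/" "Mozilla/5.0"
```

Tools like Screaming Frog's Log File Analyser can ingest lines in exactly this shape.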

The Script

This assumes that you have Logflare enabled, working, and recording data to your BigQuery account.

You will need to install the BigQuery Node.js client by running npm install @google-cloud/bigquery at the command line.

You will also need to follow Google's instructions (Authenticating as a service account) to create a service account and download the JSON key file.
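As an aside, instead of hardcoding keyFilename in the script's config, Google's client libraries will also pick up the key file from the standard GOOGLE_APPLICATION_CREDENTIALS environment variable; the path below is a placeholder.

```shell
# Point the Google Cloud client libraries at the service account key file.
# With this set, new BigQuery() can be called without a keyFilename option.
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
```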

Then edit the config details as you like; these are all at the top of the script.

                
// require libs
// run: npm install @google-cloud/bigquery
const { BigQuery } = require('@google-cloud/bigquery');
const fs = require('fs');

// BigQuery config - see https://cloud.google.com/docs/authentication/production#create_service_account
const options = {
  keyFilename: '{path_to_key_file}',
  projectId: '{project_id}',
};

// the name of the logfile to create
const logName = 'access.log';

// your Logflare BigQuery table
const table = "{table_name}";

// time period: uncomment / comment, or add your own
/*
---------
ALL TODAY
---------
*/
// const period = `TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), DAY) AND CURRENT_TIMESTAMP()`;

/*
-------------
ALL YESTERDAY
-------------
*/
const period = `TIMESTAMP_TRUNC(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY), DAY) AND TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), DAY)`;

/*
-------------
LAST 24 HOURS
-------------
*/
// const period = `TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) AND CURRENT_TIMESTAMP()`;

/* end config */

const bigquery = new BigQuery(options);

async function createLogfile() {
  // unnest the Logflare metadata to pull the request headers and
  // Cloudflare fields out alongside the event message
  const query = `SELECT
  FORMAT_TIMESTAMP('%d/%b/%Y:%H:%M:%S', t0.timestamp) AS formattedTimestamp,
  t0.id,
  f4.httpProtocol,
  f3.referer,
  t0.event_message
FROM
  ${table} AS t0
  INNER JOIN UNNEST(t0.metadata) AS f1 ON TRUE
  INNER JOIN UNNEST(f1.request) AS f2 ON TRUE
  INNER JOIN UNNEST(f2.headers) AS f3 ON TRUE
  INNER JOIN UNNEST(f2.cf) AS f4 ON TRUE
WHERE
  (t0.timestamp BETWEEN ${period})
ORDER BY
  t0.timestamp ASC
`;

  const queryOptions = {
    query: query,
    location: 'US',
  };

  const [job] = await bigquery.createQueryJob(queryOptions);
  console.log(`Job ${job.id} started.`);

  const [rows] = await job.getQueryResults();

  console.log('Rows:');
  let logContent = '';
  rows.forEach(row => {
    // event_message is pipe-delimited; the indices below pull out the
    // method, status, IP, path and user agent
    const parsed = row.event_message.split(' | ');
    let ref = `"-"`;
    if (row.referer) {
      ref = `"${row.referer}"`;
    }
    // content length isn't recorded by Logflare, so it is hardcoded to 0
    const formattedRow = `${parsed[2]} - - [${row.formattedTimestamp} +0000] "${parsed[0]} ${parsed[4]} ${row.httpProtocol}" ${parsed[1]} 0 ${ref} "${parsed[5]}"`;
    console.log(formattedRow);
    logContent += `${formattedRow}\n`;
  });
  fs.writeFileSync(logName, logContent, 'utf-8');
}

createLogfile();
                
            
View in GitHub Gist

Running this file using node {filename_you_save_as}.js will create a log file with the name specified in the config.


About the Author:

Dave Smart

Technical SEO Consultant at Tame the Bots.