Skip to content
Advertisement

How can I add a timestamp to every line in a piped log output?

I have to run some rather lengthy and legacy PHP scripts using cron. I want to capture the pid, output (both std and err) and log it to a file, but I would also like every line of this output to be timestamped. These scripts are quite fragile and it is not viable to properly instrument them with logging.

I came up with a bit of bash + gawk script that accomplishes this like so:

# run php in a subshell, capturing the PID and redirecting the output.
(php $@ 2>&1 & echo $! >&3) 3> $PID_FILE | gawk 
    -v php_pid=$(while ! test -s $PID_FILE; do sleep .1; done; cat $PID_FILE) 
    '{ print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }' 
    >> $LOG_FILE

This works OK, however, it’s a rather convoluted and obscure solution to an otherwise simple problem.

Is there a simpler, better solution or tool to accomplish this?

Advertisement

Answer

an actual alternative –

Let awk handle the wait.

mkfifo pipe_name           # a FIFO as a tempfile for un-timestamped data
php $@ > pipe_name 2>&1 &  # your program in background to the FIFO
echo $! > $PID_FILE        # the pid into a holding file
gawk 'BEGIN { while ( 0 == getline php_pid < "pid" ) { system("sleep .1") } }
  { print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }' pipe_name >> $LOG_FILE
rm -f pid pipe_name        # clean up

I don’t like that this spawns a whole subshell for a tenth-second sleep, but it avoids loading another extension and having to be concerned whether this local system is above or below version 5.1… (c.f. the manual entry for details)

just a rewrite

On closer inspection, this solution is riddled with problems and I don’t recommend you use it.

First, it fails if the input has more than one string.

$: for x in one two three; do sleep 1; echo test $x; done | sed 's/^/date +"%F %T - "/e'
date: extra operand ‘one’
Try 'date --help' for more information.

date: extra operand ‘two’
Try 'date --help' for more information.

date: extra operand ‘three’
Try 'date --help' for more information.

I can fix that by putting the whole line into the date format:

$: for x in one two three; do sleep 1; echo test $x; done | sed -E 's/^(.*)/date +"%F %T - 1"/e'
2020-12-09 09:50:15 - test one
2020-12-09 09:50:16 - test two
2020-12-09 09:50:18 - test three

but that will corrupt any data it sees as a valid formatting directive.

$: for x in one two three; do sleep 1; echo "test %c $x"; done | sed -E 's/^(.*)/date +"%F %T - 1"/e'
2020-12-09 09:51:56 - test Wed, Dec 09, 2020  9:51:56 AM one
2020-12-09 09:51:57 - test Wed, Dec 09, 2020  9:51:57 AM two
2020-12-09 09:51:58 - test Wed, Dec 09, 2020  9:51:58 AM three

This method is really much kludgier than what you already had, as it’s spawning a date process for every line…

Adding in the PID is the only reason it becomes problematic, and your solution for that, while visually painful, really isn’t bad at all.

Taking another look… is the PID file really taking long enough to write that spawning the gawk reads an empty? This works fine for me:

$: (./tst 2>&1 & echo $! >pid ) | awk -v php_pid=$(<pid) '{
     print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }'

If you just want a slightly less confusing, more maintainable way to write basically the same logic –

{ ./tst 2>&1 & echo $! >pid; } | { declare -i val=0; until (( val )); do cnt=$(<pid); sleep .1; done; awk -v php_pid=$(<pid) '{print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }'; }

Switched to curly enclosure, works fine for this.

Or, using your values,

{ php $@ 2>&1 & echo $! > $PID_FILE; } |                 # run the job in bg, save PID
{ declare -i val=0;                                      # initialize a gatekeeper
  until (( val )); do cnt=$(<$PID_FILE); sleep .1; done; # wait till pid is loaded
  # now process  the data
  gawk -v php_pid=$(<$PID_FILE) '{ print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }' >> $LOG_FILE
}

original

Pipe it through sed with a expression substitution.

$: for x in one two three; do sleep 1; echo $x; done | sed 's/^/date +"%F %T - "/e'
2020-12-04 10:26:25 - one
2020-12-04 10:26:26 - two
2020-12-04 10:26:27 - three
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement