getweblogfiles.pl - A script to download and process logfiles.
getweblogfiles.pl [--configfile file] [--debug] [--help] [--little] [--section SectionName] [--template file] [--verbose] [--version]
getweblogfiles.pl is a Perl script that automatically processes various kinds of files, e.g. web server or mail server logs. It uses a hash file to store the name, size, date and a checksum of all successfully processed files. This makes it possible to identify files that have already been processed and to work only on newly added files, thus minimizing network traffic.
This script works in 8 stages, some of which are optional. Only stages 1, 2 and 4 are mandatory.
Depending on the options selected, every logfile may be touched up to three times: during the preprocessing, the processing and the postprocessing stage.
The name, size and date of all files already processed are stored in a hash file. Hence getweblogfiles.pl is able to detect new files as well as files that have changed since the last run of the script. That makes it possible to know when a logfile needs processing even if its name always remains the same.
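The comparison described above can be sketched roughly as follows. This is only an illustration: it assumes one record per line in the form name:date:size:md5 with a numeric date, and the helper names parse_hash_lines and needs_processing are hypothetical, not taken from the script. The checksum is stored but not compared in this sketch.

```perl
use strict;
use warnings;

# Parse hash-file lines of the assumed form "name:date:size:md5"
# into a lookup table keyed by file name.
sub parse_hash_lines {
    my @lines = @_;
    my %seen;
    foreach my $line (@lines) {
        chomp $line;
        my ( $name, $date, $size, $md5 ) = split /:/, $line;
        next if !defined $name || $name eq '';
        $seen{$name} = {
            date => $date,
            size => $size,
            md5  => defined $md5 ? $md5 : 'unknown',
        };
    }
    return \%seen;
}

# Return 1 if the file is new or differs from the stored record, else 0.
sub needs_processing {
    my ( $seen, $name, $date, $size ) = @_;
    my $old = $seen->{$name};
    return 1 if !defined $old;             # never seen before
    return 1 if $old->{date} ne $date;     # timestamp changed
    return 1 if $old->{size} != $size;     # size changed
    return 0;
}
```

A file whose name is unchanged but whose size or date differs is reported as needing processing, which matches the behaviour described for logfiles that keep the same name.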
During this stage getweblogfiles.pl decompresses, sorts and concatenates new logfiles. Decompression starts automatically for .gz and .bz2 files. Other file types are not supported at present. The sort order can be selected with the LogFileSortOrder option.
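As a rough illustration of the sort step, the sketch below orders a list of logfiles according to the four LogFileSortOrder values named in the configuration template (timestamp, ascending, descending, none). The record layout with name and mtime keys, the function name sort_logfiles, and the choice of oldest-first for timestamp are assumptions made for this example.

```perl
use strict;
use warnings;

# Sort logfile records according to a LogFileSortOrder-style setting.
# Each record is assumed to be a hash ref: { name => ..., mtime => ... }.
sub sort_logfiles {
    my ( $order, @files ) = @_;
    $order = lc $order;
    my @sorted;
    if ( $order eq 'timestamp' ) {
        # oldest first (an assumption of this sketch)
        @sorted = sort { $a->{mtime} <=> $b->{mtime} } @files;
    }
    elsif ( $order eq 'ascending' ) {
        @sorted = sort { $a->{name} cmp $b->{name} } @files;
    }
    elsif ( $order eq 'descending' ) {
        @sorted = sort { $b->{name} cmp $a->{name} } @files;
    }
    else {
        @sorted = @files;    # "none": keep the given order
    }
    return @sorted;
}
```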
The preprocessing stage is the first of three processing stages. It can be used, for example, to resolve DNS names. The options PreProcessingProgram and PreProcessingOptions control this stage.
The wild cards %TI and %TO can be used in PreProcessingOptions to refer to the input and output files.
After each stage, the contents of the output file will be copied back to the input file. So, %TI always contains the contents of the input file at the start of every main stage.
To omit this stage, make sure PreProcessingProgram is set to not_set; otherwise the script stops with an error.
Note: This part is optional.
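The way a stage's command line could be assembled from a program setting and an options string containing %TI and %TO is sketched below. This is a simplified illustration, not the script's own code: the function name build_stage_command and the example program path are hypothetical.

```perl
use strict;
use warnings;

# Build the command line for one stage. Returns nothing (undef in scalar
# context) when the stage is disabled with "not_set".
sub build_stage_command {
    my ( $program, $options, $temp_in, $temp_out ) = @_;
    return if lc($program) eq 'not_set';

    # Replace the wild cards with the temporary file names.
    $options =~ s/%TI/$temp_in/g;
    $options =~ s/%TO/$temp_out/g;

    return $options eq '' ? $program : "$program $options";
}

# Hypothetical usage: a resolver that reads %TI and writes %TO.
my $command = build_stage_command(
    '/usr/local/bin/resolver', '--in %TI --out %TO',
    '/tmp/input.tmp',          '/tmp/output.tmp'
);
```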
This is the main processing stage. The program and its options can be selected with ProcessingProgram and ProcessingOptions. The selected program starts on every run of getweblogfiles.pl.
For more information see ``Preprocess files''.
This stage is optional as well. The program and its options can be selected with PostProcessingProgram and PostProcessingOptions.
For more information see ``Preprocess files''.
Note: This part is optional.
If this is set, the logfiles being processed will be appended to the file given in DestinationLogFile.
Note: This part is optional.
If DeleteOldFTPLogs is set, getweblogfiles.pl removes the processed logfiles from the FTP server. If DeleteOldLocalLogs is set, it removes the local copies.
Note: This part is optional.
Sets the ownership and permissions of all files and directories listed in FilePermissions. The settings can also be applied recursively to subdirectories.
Note: This part is optional.
This script does not use the PATH environment setting. Please specify every program with its full path.
This program uses these types of messages:
Display the state of the program and some other useful information.
Display program internal data for debugging purposes.
Display warning messages. The program will continue.
Display error messages and, if possible, the cause of the error, and then stop the script.
getweblogfiles.pl knows three ways to select the section used in the configuration file:
This one's taken ;-)
It uses the section that is not called General.
It is necessary to select the section with the option --section SectionName.
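The three cases above can be sketched as a small decision function. The sketch assumes that General is the default when no section is requested; the function name select_section and the use of undef to signal an ambiguous configuration are choices made for this illustration only.

```perl
use strict;
use warnings;

# Decide which configuration section to use.
# $requested: value of --section, or undef if the option was not given.
# @sections:  all section names found in the configuration file.
# Returns the section name, or undef if --section is required.
sub select_section {
    my ( $requested, @sections ) = @_;
    $requested = 'General' if !defined $requested;

    if ( $requested eq 'General' ) {
        # More than two sections: the choice is ambiguous.
        return undef if @sections > 2;

        # Exactly two sections: use the one that is not called General.
        if ( @sections == 2 ) {
            foreach my $name (@sections) {
                return $name if $name ne 'General';
            }
        }
    }
    return $requested;
}
```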
The script reads all necessary settings from a configuration file. You can create a sample of this configuration file using the option --template FileName.
After it starts, getweblogfiles.pl searches for one of the following configuration files:
The first file found will be used.
If you want to use a different file, specify it with --configfile FileName.
To simplify the usage of some options, getweblogfiles.pl understands the following wild cards:
You can use them in all path, filename, program and mail settings.
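A minimal sketch of such a substitution is shown below. It covers the wild cards listed in the script's configuration template (%D, %M, %Y, %y, %h, %TI, %TO) plus a leading ~/ for the home directory. The function name expand_wildcards and the way the values are passed in as a hash are assumptions of this example.

```perl
use strict;
use warnings;

# Expand wild cards in a setting. %value maps wild card letters
# (D, M, Y, y, h, TI, TO) and HOME to their replacement strings.
sub expand_wildcards {
    my ( $text, %value ) = @_;

    # A leading "~/" refers to the user's home directory.
    $text =~ s{^~/}{$value{HOME}/} if exists $value{HOME};

    # Replace known wild cards; leave unknown ones untouched.
    $text =~ s/%(TI|TO|[DMYyh])/exists $value{$1} ? $value{$1} : "%$1"/ge;

    return $text;
}

my $path = expand_wildcards(
    '~/logs/example_%Y-%M-%D.log',
    HOME => '/home/web', Y => '2007', M => '06', D => '23'
);
```

Note that %Y and %y are distinct: the first stands for the four-digit year, the second for the two-digit year.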
Short configuration to analyse the HTTP logfiles of the two domains example.com and test.com using AWStats.
    [General]
    #
    # Exclude ftp logs always
    ExcludePattern=ftp_log
    #
    # Don't copy the content of %TO back to %TI, because we want to
    # append the entries to a huge logfile specified with
    # "DestinationLogFile" in the later sections.
    ProcessingSwapTempFiles=no

    [example]
    FTPServer=example.com
    FTPUser=my_special_ftp_user
    FTPPassword=top_secret
    HashFile=~/getweblogfiles.d/example.com.hash
    LocalFileDirectory=~/logs/example.com/
    ProcessingProgram=/usr/lib/cgi-bin/awstats.pl
    ProcessingOptions=-config=example.com -update -showsteps -logfile=%TI
    DestinationLogFile=~/logs/example.com/example_%Y.de

    [test]
    FTPServer=test.com
    FTPUser=my_special_ftp_user
    FTPPassword=top_secret
    HashFile=~/getweblogfiles.d/test.com.hash
    LocalFileDirectory=~/logs/test.com/
    ProcessingProgram=/usr/lib/cgi-bin/awstats.pl
    ProcessingOptions=-config=test.com -update -showsteps -logfile=%TI
    DestinationLogFile=~/logs/test.com/test_%Y.de
The script can send its output by mail to an email account. It uses the sendmail daemon to send the mail.
For more information about the settings, see the MailSender, MailReceiver, MailSubject and SendMail options in the configuration file.
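The decision whether a mail is sent can be sketched as follows. The three SendMail values (always, onerror, none) and the fallback to onerror for any other value follow the script source; the function name should_send_mail is hypothetical.

```perl
use strict;
use warnings;

# Decide whether a status mail should be sent.
# $setting:   value of the SendMail option
# $exit_code: return value the script is about to exit with
sub should_send_mail {
    my ( $setting, $exit_code ) = @_;
    $setting = lc $setting;

    # Unknown values are treated like "onerror".
    $setting = 'onerror' if $setting !~ /^(?:always|none|onerror)$/;

    return 1 if $setting eq 'always';
    return 1 if $setting eq 'onerror' && $exit_code != 0;
    return 0;
}
```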
Return values and their meanings:
Warning messages:
The script stops processing under these conditions. Once the described problem is fixed, the script will run normally.
The parameter given with the option DestinationLogFile is not a regular file. It could be a link or a directory. Please check it and set this configuration entry to an existing file, or use a non-existing name so that a new file can be created.
The option LocalFileDirectory refers to an existing file system entry that is not a directory. Please check the configuration file and set it to an existing directory or to a new one, which will be created.
You have selected a preprocessing program, but the program does not exist, or you do not have permission to execute it. Please check the entry PreProcessingProgram or set it to not_set.
You have not selected a main program, the program does not exist, or you do not have permission to execute it. Please check the entry ProcessingProgram and set it to an existing program.
You have selected a postprocessing program, but the program does not exist, or you do not have permission to execute it. Please check the entry PostProcessingProgram or set it to not_set.
You tried to write a sample configuration to an existing file. This is not possible. Please try again with a new (non-existing) file name.
The configuration file contains sensitive data like ftp passwords. This file should be only readable and writeable by the user of this script. There is no need to have read and write permissions for groups or others! Remove these permissions and try again.
The configuration file contains more than 2 sections. Please specify the section used with the commandline option --section.
One or more command-line options are wrong. Perhaps a typo? Please correct these options and try again.
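The permission check behind the configuration file warning above could look roughly like this. The sketch tests only the read and write bits for group and others, as the warning text describes; whether the script checks exactly these bits is an assumption, and the function name is_too_permissive is hypothetical.

```perl
use strict;
use warnings;

# Return 1 if the given file mode grants read or write access
# to group or others, else 0.
sub is_too_permissive {
    my ($mode) = @_;
    return ( $mode & 0066 ) ? 1 : 0;
}

# Typical use with a real file (the file name is only an example):
#   my $mode = ( stat '/path/to/config.ini' )[2];
#   warn "config file is too open\n" if is_too_permissive($mode);
```

A file with mode 0600 passes this check. Running chmod 600 on the configuration file is one way to remove the offending permissions.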
Error messages:
This error occurs when copying internal data between the temporary files. Please check the error message, correct the error and try again.
This error occurs during the execution of some programs. Please check the error message, fix the error and try again.
This occurs if there is a problem changing into the specified directory on the FTP server. Perhaps the specified directory does not exist? Please check the setting of the option FTPDirectory and try again.
Please check the settings for the ftp connection, i.e. username, password and server and try again.
Please check the FTP details, especially the passive mode setting, and try again.
The script has not found a configuration file. Please select a configuration file manually with the commandline option --configfile.
Please check the spelling of the section name and the content of the configuration file and try again.
A general error occurred while reading the configuration file. Please check the error message, fix the error and try again.
Please check the error message, fix the error and try again.
Please check the error message, fix the error and try again.
Please check the error message, fix the error and try again.
Please check the error message, fix the error and try again.
Please check the error message, fix the error and try again.
If the directory set as LocalFileDirectory doesn't exist, the script tries to create it. This seems to have failed. Please check the error message. Maybe you have tried to create two or more levels of directories at once. In that case you should create these directories manually.
Please check the error message, fix the error and try again.
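A calling script can tell warnings from errors by looking at the return value. In the script source, 0 means success, the values 1 to 9 are used for warnings and 10 to 24 for errors; the sketch below groups them accordingly. The function name classify_exit_code is hypothetical.

```perl
use strict;
use warnings;

# Map a getweblogfiles.pl return value to a coarse category.
sub classify_exit_code {
    my ($code) = @_;
    return 'success' if $code == 0;
    return 'warning' if $code >= 1  && $code <= 9;
    return 'error'   if $code >= 10 && $code <= 24;
    return 'unknown';
}
```

When the script is started with Perl's system function, the return value is available as $? >> 8 and can be passed to such a helper.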
getweblogfiles.pl knows three debug levels:
e.g. output of the exec()ed scripts and internal loop sequences
getweblogfiles.pl uses File::Temp to create secure temporary files. It also uses umask to set the file permissions of the downloaded logfiles. If you want to use a umask setting other than 077, have a look at the description of the Umask option in the configuration file. It is recommended to use the FilePermissions option instead.
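The combination of File::Temp and a restrictive umask can be illustrated with the following sketch. It is a generic example of the technique rather than the script's own code; the function name create_private_tempfile is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a temporary file under a restrictive umask and return its
# name together with its permission bits.
sub create_private_tempfile {
    my $old_umask = umask 0077;                     # no access for group/others
    my ( $fh, $name ) = tempfile( UNLINK => 1 );    # removed at program exit
    print {$fh} "example log line\n";
    close $fh or die "Can't close $name: $!";
    umask $old_umask;                               # restore the previous umask
    my $mode = ( stat $name )[2] & 07777;
    return ( $name, $mode );
}

my ( $tempname, $tempmode ) = create_private_tempfile();
printf "%s has mode %04o\n", $tempname, $tempmode;
```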
The user running this script can read the FTP server passwords and all logs. Be careful! It is highly recommended not to run this script with the permissions of the web server user.
Please send bug reports to mail_for_carstengrohmann.de.
You can find the latest version of this script at: http://www.carstengrohmann.de
fixed some typos.
getweblogfiles.pl
by Carsten Grohmann mail_for_carstengrohmann.de
Version: 2.3
Build Date: 2007-06-23
$Id: getweblogfiles.pl,v 1.25 2007-06-23 17:07:03 carsten Exp $
Copyright (c) 2003-2007 Carsten Grohmann
This package is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the license, or any later version.
Please refer to http://www.fsf.org/licenses/licenses.html for details.
#!/usr/bin/perl # # Author : Carsten Grohmann # # Licence : GPL # # Version : $Id: getweblogfiles.pl,v 1.25 2007-06-23 17:07:03 carsten Exp $ # # History : # # $Log: getweblogfiles.pl,v $ # Revision 1.25 2007-06-23 17:07:03 carsten # - add own cat() implementation (and remove dependencies of File::Cat) # - version 2.3 # # Revision 1.24 2007-06-20 15:00:02 carsten # - fix different namings of FTPDirectory # - exit script of the configuration section doesn't exist # - some small internal improvements and changes # - add example to documentation # - version 2.2 # # Revision 1.23 2007-06-19 09:26:10 carsten # - fix typo # # Revision 1.22 2007-06-19 09:07:42 carsten # - don't send emails per default # - change classification of some DEBUG messages to INFO messages # - set forgotten FTPTimeout # - add missing newline to some messages # - add additional INFO messages # # Revision 1.21 2007-06-18 19:30:00 carsten # - add prefix "Debug:" to all debug messages # - add debuglevel 3 to show FTP internal commands and messages # - fix wrong checks of PostProcessingProgram binary # # Revision 1.20 2007-06-18 08:44:10 carsten # - fix wrong operator # # Revision 1.19 2007-06-16 16:40:49 carsten # - add not full implemented passive FTP mode handling # - version 2.1 # # Revision 1.18 2005/12/19 19:00:01 carsten # - version 2.0c # - add option --man to show the internal manpage directly # # Revision 1.17 2004/12/30 12:44:13 carsten # - version 2.0b # - decrease the debug level of two "Skip..." 
messages to reduce output # # Revision 1.16 2004/10/23 20:24:56 carsten # - Version 2.0 # - change only date and version # # Revision 1.15 2004/09/27 17:02:55 carsten # - Version 2.0rc2 # - sort log file entries by date and time (Option SortLogFile) # - remove using $_ # - store md5 hash also to catch touched but unchanged log files # # Revision 1.14 2004/07/31 15:12:39 carsten # - Version 2.0rc1 # - set default settings to stricter values # - add IncludePattern and ExcludePattern to check filenames # - add FilePermissions to set the ownership and the permissions of # downloaded and processed files # - PreProcessProgram and PostProcessProgram have to set "not_set" # to disable # - small changes in the documentation # # Revision 1.13 2004/04/25 22:35:11 carsten # - Version 2.0beta4 # - enhanced english documentation # - fix bug: checking content of "SendMail" # - change test of existing program from -e to -x # - other small corrections # # Revision 1.12 2004/04/22 19:32:40 carsten # - Version 2.0beta3 # - komplett und flexibler neugeschrieben # - besser dokumentiert # # # Note: # - This script is formated with perltidy -bbvt=0 -bl use strict; use diagnostics; use Fcntl ':mode'; use Config::IniFiles; use Digest::MD5; use File::Copy; use File::Listing; use File::Temp; use Getopt::Long; use Mail::Mailer; use Net::FTP; use Pod::Usage; my @messages; # to save program messages my %param = ( # default values of all configuration # parameters 'DEBUGLEVEL' => '0', 'SECTION' => 'General', 'VERSION' => '2.3', 'DebugLevel' => 0, 'DeleteOldFTPLogs' => 'no', 'DeleteOldLocalLogs' => 'no', 'DestinationLogFile' => 'none', 'ExcludePattern' => 'not_set', 'FilePermissions' => ['not_set'], 'FTPDirectory' => 'logs', 'FTPPassive' => 'yes', 'FTPPassword' => 'mysecret', 'FTPServer' => 'my.domain.tld', 'FTPTimeout' => 120, 'FTPUser' => 'MyFTPUser', 'HashFile' => '~/.getweblogfiles.hash', 'LITTLE' => 'no', 'IncludePattern' => 'not_set', 'LocalFileDirectory' => '~/getweblogfiles.d', 
'LogFileSortOrder' => 'none', 'MailSender' => 'root', 'MailReceiver' => 'root', 'MailSubject' => 'Message of getweblogfiles.pl', 'PreProcessingSwapTempFiles' => 'yes', 'PreProcessingProgram' => 'not_set', 'PreProcessingOptions' => '', 'ProcessingSwapTempFiles' => 'yes', 'ProcessingProgram' => '/path/to/awstats.pl', 'ProcessingOptions' => '-config=myconfig -update -showsteps -logfile=%TI', 'PostProcessingSwapTempFiles' => 'yes', 'PostProcessingProgram' => 'not_set', 'PostProcessingOptions' => '', 'ShowDebugMessages' => 'no', 'ShowMessages' => 'yes', 'ShowWarnings' => 'yes', 'SendMail' => 'none', 'SortLogFile' => 'yes', 'Umask' => '077' ); my %returnvalue = ( 'successfull' => 0, # no error 'warning_destinationlogfile' => 1, # warning destination logfile exists # but so not a regular file 'warning_localdirectory' => 2, # warning local directory exists but is # not a directory 'warning_nopreprocessingprogram' => 3, # preprocessing program not found 'warning_nomainprogram' => 4, # warning main program availably 'warning_nopostprocessingprogram' => 5, # postprocessing program not found 'warning_templateexists' => 6, # warning template configuration file # exists 'warning_configfileperms' => 7, # warning the config file is readable # by group and world! 
'warning_config_nosection_selected' => 8, # no section selected 'warning_wrongsyntax' => 9, # wrong command line option 'error_copytempfiles' => 10, # error during copy temporary files 'error_execprogram' => 11, # error on exec program 'error_ftp_cd' => 12, # error during cd on the ftp server 'error_ftp_login' => 13, # error on ftp login 'error_ftp_get_failed' => 14, # error on ftp get command 'error_config_nofile' => 15, # error no configuration file found 'error_config_nosection' => 16, # error requested section not in config file 'error_config_read' => 17, # error during read configuration file 'error_noftpconnection' => 18, # error during open ftp connetion 'error_openhashfile' => 19, # error during open hash file 'error_postprocessingprogram' => 20, # error in post processing program 'error_preprocessingprogram' => 21, # error in pre processing program 'error_processingprogramm' => 22, # error in main process 'error_createlocaldirectory' => 23, # error during create LocalFileDirecory 'error_concatenate_opensource' => 24 # error on opening source file to # concatenate ); my %oldlogfiles; my %newlogfiles; my $tempinput_handle; # handle of the temporary inputfile my $tempinput_name; # filename of the temporary inputfile my $tempoutput_handle; # handle of the temporary outputfile my $tempoutput_name; # filename of the temporary outputfile my @configparams = ( # subset configuration parameters which # is read from and write to the # configuration file "DebugLevel", "DeleteOldFTPLogs", "DeleteOldLocalLogs", "DestinationLogFile", "ExcludePattern", "FTPDirectory", "FTPPassive", "FTPPassword", "FTPServer", "FTPTimeout", "FTPUser", "HashFile", "IncludePattern", "LocalFileDirectory", "LogFileSortOrder", "MailSender", "MailReceiver", "MailSubject", "PreProcessingOptions", "PreProcessingProgram", "PreProcessingSwapTempFiles", "ProcessingOptions", "ProcessingProgram", "ProcessingSwapTempFiles", "PostProcessingOptions", "PostProcessingProgram", "PostProcessingSwapTempFiles", 
"SendMail", "ShowDebugMessages", "ShowMessages", "ShowWarnings", "SortLogFile", "Umask" ); my %months = ( # hash with months to translate names to number "Jan" => "01", "Feb" => "02", "Mar" => "03", "Apr" => "04", "May" => "05", "Jun" => "06", "Jul" => "07", "Aug" => "08", "Sep" => "09", "Oct" => "10", "Nov" => "11", "Dec" => "12" ); #----------------------------------------------------------------------- # prints a debug message # Hint: # if level not defined it will set to 1 # parameters: # 1. message # 2. debug level # return value: none #----------------------------------------------------------------------- sub DEBUG ($$) { my $level = $_[0]; my $msg = "DEBUG: " . $_[1]; $level = 1 if ( !defined $level ); if ( $level <= $param{'DEBUGLEVEL'} ) { print $msg; push( @messages, $msg ); } } #----------------------------------------------------------------------- # prints an program information #----------------------------------------------------------------------- sub INFO ($) { my $msg = $_[0]; if ( $param{'ShowMessages'} eq "yes" ) { print $msg; push( @messages, $msg ); } } #----------------------------------------------------------------------- # prints an error message #----------------------------------------------------------------------- sub ERROR ($) { my $msg = "ERROR: " . $_[0]; print $msg; push( @messages, $msg ); } #----------------------------------------------------------------------- # prints an warning #----------------------------------------------------------------------- sub WARNING ($) { my $msg = "WARNING: " . 
$_[0]; if ( $param{'ShowWarnings'} eq "yes" ) { print $msg; push( @messages, $msg ); } } #----------------------------------------------------------------------- # exit the the script # parameter: # optional: exit code # return value: none #----------------------------------------------------------------------- sub EXIT ($) { my $returnvalue = $_[0]; if ( !defined $returnvalue ) { $returnvalue = 0; } # delete not compelete processed logfiles if (%newlogfiles) { foreach my $file ( keys %newlogfiles ) { if ( -f $newlogfiles{$file}->{'file'} ) { unlink $newlogfiles{$file}->{'file'}; } } } # delete temporary files if ( defined $tempinput_name ) { unlink($tempinput_name); } if ( defined $tempoutput_name ) { unlink($tempoutput_name); } # send mail if wished if ( ( $param{'SendMail'} eq 'always' ) || ( ( $param{'SendMail'} eq 'onerror' ) && ( $returnvalue != 0 ) ) ) { DEBUG( 1, "Send state mail\n" ); my $mailer = Mail::Mailer->new('sendmail'); my %mailheader = ( "To" => $param{'MailReceiver'}, "From" => $param{'MailSender'}, "Subject" => $param{'MailSubject'} ); push( @messages, "Returnvalue of this script: $returnvalue\n" ); $mailer->open( \%mailheader ); print $mailer @messages; $mailer->close(); } # finish script exit $returnvalue; } #----------------------------------------------------------------------- # Append a file to another file # parameter: # 1. name of the source file # 2. 
file handle of the file to append # return value: # 0 on error # 1 on success #----------------------------------------------------------------------- sub cat($$) { my ( $input, $handle ) = @_; my ( $source, $dest ) = @_; if ( !open( INFH, $source ) ) { ERROR("Can't open source file \"$input\" to read content: $!\n"); EXIT( $returnvalue{'error_concatenate_opensource'} ); } while ( my $line = <INFH> ) { print $dest $line; } close(INFH); return 1 } #----------------------------------------------------------------------- # replace wild card in file and directory names # parameter: file/path name # return value: processed file/path name #----------------------------------------------------------------------- sub replace_wildcards($) { my ($name) = $_[0]; if ( !defined $name ) { return undef; } if ( $name eq "" ) { return $name; } # replace ~ if ( $name =~ m!^~/! ) { $name =~ s|^~/|$ENV{'HOME'}/|g; } # replace %D if ( $name =~ m/%D/ ) { $name =~ s/%D/$param{'DAY'}/g; } # replace %M if ( $name =~ m/%M/ ) { $name =~ s/%M/$param{'MONTH'}/g; } # replace %Y if ( $name =~ m/%Y/ ) { $name =~ s/%Y/$param{'YEAR'}/g; } # replace %y if ( $name =~ m/%y/ ) { $name =~ s/%y/$param{'YEAR_SHORT'}/g; } # replace %TI if ( $name =~ m/%TI/ ) { $name =~ s/%TI/$tempinput_name/g; } # replace %TO if ( $name =~ m/%TO/ ) { $name =~ s/%TO/$tempoutput_name/g; } # replace %h if ( $name =~ m/%h/ ) { $name =~ s/%h/$param{'FQDN'}/g; } return $name; } #----------------------------------------------------------------------- # load the configuration from an file # parameters # 1. filename of the configuration file # 2. configuration section #----------------------------------------------------------------------- sub load_config($$) { my ( $filename, $section ) = @_; my $config = new Config::IniFiles( -file => $filename, -default => "General" ); if ( !defined($config) ) { ERROR( "by reading the configuration file \"" . $filename . "\"\n" . "Error message:\n" . 
join( "\n", @Config::IniFiles::errors ) ); EXIT( $returnvalue{'error_config_read'} ); } # check multi sections if ( ( scalar $config->Sections() > 2 ) && ( $section eq "General" ) ) { ERROR( "The configuration file contains more than 2 sections.\n" . "Please specify the used section with the commandline option " . "--section.\n" ); EXIT( $returnvalue{'warning_config_nosection_selected'} ); } # select section if ( ( scalar $config->Sections() == 2 ) && ( $section eq "General" ) ) { foreach my $value ( $config->Sections() ) { if ( $value ne "General" ) { $section = $value; last; } } } if ( !$config->SectionExists($section) ) { ERROR( sprintf( "The given section \"%s\" doesn't exists.\n" . "Please specify an existing section with the commandline " . "option --section.\n", $section ) ); EXIT( $returnvalue{'error_config_nosection'} ); } foreach my $value (@configparams) { if ( defined( $config->val( $section, $value ) ) ) { $param{$value} = $config->val( $section, $value ); } } if ( defined( $config->val( $section, "FilePermissions" ) ) ) { @{ $param{'FilePermissions'} } = $config->val( $section, "FilePermissions" ); } # replace wildcards $param{'HashFile'} = replace_wildcards( $param{'HashFile'} ); $param{'LocalFileDirectory'} = replace_wildcards( $param{'LocalFileDirectory'} ); $param{'MailSender'} = replace_wildcards( $param{'MailSender'} ); $param{'MailReceiver'} = replace_wildcards( $param{'MailReceiver'} ); $param{'MailSubject'} = replace_wildcards( $param{'MailSubject'} ); $param{'PreProcessingProgram'} = replace_wildcards( $param{'PreProcessingProgram'} ); $param{'PreProcessingOptions'} = replace_wildcards( $param{'PreProcessingOptions'} ); $param{'ProcessingProgram'} = replace_wildcards( $param{'ProcessingProgram'} ); $param{'ProcessingOptions'} = replace_wildcards( $param{'ProcessingOptions'} ); $param{'PostProcessingProgram'} = replace_wildcards( $param{'PostProcessingProgram'} ); $param{'PostProcessingOptions'} = replace_wildcards( 
$param{'PostProcessingOptions'} ); $param{'DestinationLogFile'} = replace_wildcards( $param{'DestinationLogFile'} ); # convert to lower case $param{'FTPPassive'} = lc( $param{'FTPPassive'} ); $param{'DeleteOldFTPLogs'} = lc( $param{'DeleteOldFTPLogs'} ); $param{'DeleteOldLocalLogs'} = lc( $param{'DeleteOldLocalLogs'} ); $param{'LogFileSortOrder'} = lc( $param{'LogFileSortOrder'} ); $param{'PreProcessingSwapTempFiles'} = lc( $param{'PreProcessingSwapTempFiles'} ); $param{'ProcessingSwapTempFiles'} = lc( $param{'ProcessingSwapTempFiles'} ); $param{'PostProcessingSwapTempFiles'} = lc( $param{'PostProcessingSwapTempFiles'} ); $param{'SendMail'} = lc( $param{'SendMail'} ); $param{'ShowDebugMessages'} = lc( $param{'ShowDebugMessages'} ); $param{'ShowMessages'} = lc( $param{'ShowMessages'} ); $param{'ShowWarnings'} = lc( $param{'ShowWarnings'} ); $param{'SortLogFile'} = lc( $param{'SortLogFile'} ); $param{'Umask'} = lc( $param{'Umask'} ); if ( lc( $param{'DestinationLogFile'} ) eq "none" ) { $param{'DestinationLogFile'} = lc( $param{'DestinationLogFile'} ); } if ( lc( $param{'FilePermissions'}->[0] ) eq "not_set" ) { $param{'FilePermissions'}->[0] = lc( $param{'FilePermissions'}->[0] ); } if ( $param{'ShowDebugMessages'} eq "yes" ) { $param{'DEBUGLEVEL'} = $param{'DebugLevel'}; } if ( ( $param{'SendMail'} ne "always" ) && ( $param{'SendMail'} ne "none" ) && ( $param{'SendMail'} ne "onerror" ) ) { $param{'SendMail'} = 'onerror'; } } #----------------------------------------------------------------------- # executes an program # parameter: program name and all option as scalar # return value: return value of the executed script #----------------------------------------------------------------------- sub my_exec($) { my $program = $_[0]; my $line; DEBUG( 1, "Execute $program\n" ); if ( !open( PROGRAM, "$program" ) ) { ERROR("Can't start $program return message: $!\n"); EXIT( $returnvalue{'error_execprogram'} ); } DEBUG( 2, "Begin of program output:\n" ); while ( defined( 
$line = <PROGRAM> ) ) { DEBUG( 2, "$line" ); } DEBUG( 2, "End of program output\n" ); close(PROGRAM); DEBUG( 1, "Return value of the executed program: $?\n" ); return $?; } #----------------------------------------------------------------------- # read name, date and size of the processed logfiles from a hash file # parameter: filename # return value: none #----------------------------------------------------------------------- sub load_logfiles($) { my $filename = $_[0]; my ( $name, $date, $size, $md5 ); if ( !-e $filename ) { return; } if ( !open( INFH, "<$filename" ) ) { ERROR("Can't open hash file \"$filename\" to read: $!\n"); EXIT( $returnvalue{'error_openhashfile'} ); } while ( my $line = <INFH> ) { chomp $line; ( $name, $date, $size, $md5 ) = split( /:/, $line ); $oldlogfiles{$name}->{'date'} = $date; $oldlogfiles{$name}->{'size'} = $size; if ( !defined $md5 ) { $oldlogfiles{$name}->{'md5'} = "unknown"; } else { $oldlogfiles{$name}->{'md5'} = $md5; } } close(INFH); } #----------------------------------------------------------------------- # write name, date and size of the processed logfiles in an hash file # parameter: filename # return value: none #----------------------------------------------------------------------- sub save_logfiles($) { my $filename = $_[0]; if ( !open( OUTFH, ">$filename" ) ) { ERROR("Can't open hash file \"$filename\" to write: $!\n"); EXIT( $returnvalue{'error_openhashfile'} ); } foreach my $file ( sort ( keys %oldlogfiles ) ) { next if ( $file eq "" ); print OUTFH "$file:$oldlogfiles{$file}->{'date'}:$oldlogfiles{$file}->{'size'}:" . 
"$oldlogfiles{$file}->{'md5'}\n"; } close(OUTFH); # remove empty file if ( -z $filename ) { unlink $filename; } } #----------------------------------------------------------------------- # write a template configuration file # parameter: filename # return value: none #----------------------------------------------------------------------- sub write_template($) { my $filename = $_[0]; if ( -e $filename ) { ERROR( "The template file $filename exists. Please remove\n" . "it first before you create a new file with the same name.\n" ); EXIT( $returnvalue{'warning_templateexists'} ); } my $config_templ = new Config::IniFiles(); $config_templ->AddSection("General"); $config_templ->AddSection("Project1"); $config_templ->AddSection("Project2"); foreach my $value (@configparams) { $config_templ->newval( "General", $value, $param{$value} ); } $config_templ->newval( "General", "FilePermissions", $param{'FilePermissions'}->[0] ); # write comments in the template file if ( $param{'LITTLE'} ne "yes" ) { $config_templ->SetSectionComment( "General", "Der Abschnitt [General] kann alle Einstellungen enthalten. Die in", "der Vorlage verwendeten Werte sind die Standardeinstellungen des", "Programms.", "Für alle Pfade, Dateinamen, Programm- und Mailoptionen sind", "folgende Ersetzungen möglich:", " %D laufende Tag des Monats z.B. 29", " %M aktuelle Monat z.B. 12", " %Y aktuelle Jahr 4-ziffrig z.B. 2004", " %y aktuelle Jahr 2-ziffrig z.B. 04", " %h volle qualifizierte Domänenname des lokalen Rechners", " %TI temporäre Quelldatei", " %TO temporäre Zieldate ", "Hinweis:", " Der Inhalt der temporären Zieldatei wird nach jedem Schritt", " zurück in die temporäre Quelldatei kopiert.", "", "The [General] section can contains all settings. The values in", "this template are the default values of this script.", "This script used follow substitutions for all path, filename,", "program and mail settings:", " %D Day of month e.g. 29", " %M current month e.g. 12", " %Y current year 4 digits e.g. 
2004", " %y current year 2 digits e.g. 04", " %h fully qualified domain name of the local computer", " %TI temporary source file", " %TO temporary destination file", "Note:", " After each step the content of the temporary destination file", " will copied back to the temporary source file." ); $config_templ->SetSectionComment( "Project1", "In jeden Projekt können die gleichen Parameter wie unter", "[General] verwendet werden. Die projektspezifischen Einstellungen", "überschreiben die allgemeinen Einstellungen.", "You can use all options from [General] in this project.", "The project specific settings overwite the general settings." ); $config_templ->SetSectionComment( "Project2", "In jeden Projekt können die gleichen Parameter wie unter", "[General] verwendet werden. Die projektspezifischen Einstellungen", "überschreiben die allgemeinen Einstellungen.", "You can use all options from [General] in this project.", "The project specific settings overwite the general settings." ); $config_templ->SetParameterComment( "General", "FTPDirectory", "", "das Unterzeichnis mit den Logdateien", "the logfile sub directory", "Default: " . $param{'FTPDirectory'} ); $config_templ->SetParameterComment( "General", "FTPServer", "", "der FTP-Server", "the FTP-Server", "Default: " . $param{'FTPServer'} ); $config_templ->SetParameterComment( "General", "FTPUser", "", "kein Kommentar :-)", "no comment :-)", "Default: " . $param{'FTPUser'} ); $config_templ->SetParameterComment( "General", "FTPPassword", "", "kein Kommentar :-)", "no comment :-)", "Default: " . $param{'FTPPassword'} ); $config_templ->SetParameterComment( "General", "FTPTimeout", "", "Setzt die Verfallszeit der FTP-Verbindungen", "Set the timeout value for ftp connections", "Default: " . $param{'FTPTimeout'} ); $config_templ->SetParameterComment( "General", "FTPPassive", "", "Passives FTP benutzen (yes oder no)", "Use passive ftp (yes or no)", "Default: " . 
$param{'FTPPassive'} ); $config_templ->SetParameterComment( "General", "HashFile", "", "Datei mit der Größe und dem Datum der schon verarbeiteten", "Logdateien.", "File with the size and date of the processed logfiles", "Default: " . $param{'HashFile'} ); $config_templ->SetParameterComment( "General", "DebugLevel", "", "Detailgrad der Debugmeldungen (0=keine, 4=alle)", "Detail level of the debug messages (0=none; 4=all)", "Default: " . $param{'DEBUGLEVEL'} ); $config_templ->SetParameterComment( "General", "LocalFileDirectory", "", "Verzeichnis zum Speichern der Logdateien", "Directory to save the logfiles in it", "Default: " . $param{'LocalFileDirectory'} ); $config_templ->SetParameterComment( "General", "LogFileSortOrder", "", "Sortierreihenfolge der Logdateien nach:", " timestamp - dem Zeitstempel der Datei", " ascending - Dateinamen aufsteigend sortiert", " descending - Dateinamen absteigend sortiert", " none - Logdateien werden nicht sortiert", "Sortorder of the logfiles by:", " timestamp - by the timestamp of the logfile", " ascending - by the filenames ascending", " descending - by the filenames descending", " none - logfiles will not sorted", "Default: " . $param{'LogFileSortOrder'} ); $config_templ->SetParameterComment( "General", "PreProcessingProgram", "", "Programm zum Vorbearbeiten der Logdateien", "Program to pre process the logfiles", "Default: " . $param{'PreProcessingProgram'} ); $config_templ->SetParameterComment( "General", "PreProcessingOptions", "", "Optionen für das Vorverarbeiten der Logdateien", "Options for the logfile preprocessing", "Default: " . $param{'PreProcessingOptions'} ); $config_templ->SetParameterComment( "General", "PreProcessingSwapTempFiles", "", "Kopiert nach dem Vorbearbeiten der Logdateien den Inhalt von %TO", "zurück nach %TI.", "Copied the content of %TO back to %TI after the preprocessing", "stage.", "Default: " . 
            $param{'PreProcessingSwapTempFiles'}
        );
        $config_templ->SetParameterComment(
            "General", "ProcessingProgram", "",
            "Programm zum Bearbeiten der Logdateien",
            "Program to process the logfiles",
            "Default: " . $param{'ProcessingProgram'}
        );
        $config_templ->SetParameterComment(
            "General", "ProcessingOptions", "",
            "Optionen für das Verarbeiten der Logdateien",
            "Options for the logfile processing",
            "Default: " . $param{'ProcessingOptions'}
        );
        $config_templ->SetParameterComment(
            "General", "ProcessingSwapTempFiles", "",
            "Kopiert nach dem Bearbeiten der Logdateien den Inhalt von %TO",
            "zurück nach %TI.",
            "Copy the content of %TO back to %TI after the processing stage.",
            "Default: " . $param{'ProcessingSwapTempFiles'}
        );
        $config_templ->SetParameterComment(
            "General", "PostProcessingProgram", "",
            "Programm zum Nachbearbeiten der Logdateien",
            "Program to postprocess the logfiles",
            "Default: " . $param{'PostProcessingProgram'}
        );
        $config_templ->SetParameterComment(
            "General", "PostProcessingOptions", "",
            "Optionen für das Nachbearbeiten der Logdateien",
            "Options for the logfile postprocessing",
            "Default: " . $param{'PostProcessingOptions'}
        );
        $config_templ->SetParameterComment(
            "General", "PostProcessingSwapTempFiles", "",
            "Kopiert nach dem Nachbearbeiten der Logdateien den Inhalt von %TO",
            "zurück nach %TI.",
            "Copy the content of %TO back to %TI after the postprocessing",
            "stage.",
            "Default: " . $param{'PostProcessingSwapTempFiles'}
        );
        $config_templ->SetParameterComment(
            "General", "DestinationLogFile", "",
            "Datei, an die die soeben verarbeiteten Einträge angehängt",
            "werden. Die Einträge werden immer aus %TI gelesen.",
            "\"none\" verhindert das Anhängen",
            "File to which the processed log entries are appended. The",
            "entries are always read from %TI.",
            "\"none\" prevents appending log entries",
            "Default: " .
            $param{'DestinationLogFile'}
        );
        $config_templ->SetParameterComment(
            "General", "DeleteOldFTPLogs", "",
            "Alte Logdateien auf dem FTP-Server löschen",
            "Remove old logs from the ftp server",
            "Default: " . $param{'DeleteOldFTPLogs'}
        );
        $config_templ->SetParameterComment(
            "General", "DeleteOldLocalLogs", "",
            "Alte lokale Logdateien löschen",
            "Remove old local logs",
            "Default: " . $param{'DeleteOldLocalLogs'}
        );
        $config_templ->SetParameterComment(
            "General", "ShowDebugMessages", "",
            "Debugmeldungen anzeigen (yes oder no)",
            "Show debug messages (yes or no)",
            "Default: " . $param{'ShowDebugMessages'}
        );
        $config_templ->SetParameterComment(
            "General", "ShowMessages", "",
            "Programmmeldungen anzeigen (yes oder no)",
            "Show program messages (yes or no)",
            "Default: " . $param{'ShowMessages'}
        );
        $config_templ->SetParameterComment(
            "General", "ShowWarnings", "",
            "Warnungen anzeigen (yes oder no)",
            "Show warnings (yes or no)",
            "Default: " . $param{'ShowWarnings'}
        );
        $config_templ->SetParameterComment(
            "General", "Umask", "",
            "Umask für alle neu angelegten Dateien.",
            "Nutzen Sie \"none\" wenn keine Umask gesetzt werden soll.",
            "The umask used for new files.",
            "If you don't want to set the umask use \"none\".",
            "Default: " . $param{'Umask'}
        );
        $config_templ->SetParameterComment(
            "General", "MailSender", "",
            "Absender der Statusmail.",
            "Sender of the status mail",
            "Default: " . $param{'MailSender'}
        );
        $config_templ->SetParameterComment(
            "General", "MailReceiver", "",
            "Empfänger der Statusmail.",
            "Receiver of the status mail",
            "Default: " . $param{'MailReceiver'}
        );
        $config_templ->SetParameterComment(
            "General", "MailSubject", "",
            "Betreff der Statusmail.",
            "Subject of the status mail",
            "Default: " .
            $param{'MailSubject'}
        );
        $config_templ->SetParameterComment(
            "General", "SendMail", "",
            "Schalter um die Statusmail zu versenden.",
            "Möglich sind folgende drei Einstellungen:",
            " onerror - versendet eine Mail im Falle eines Fehlers",
            " always - die Mail wird immer versandt",
            " none - es wird nie eine Mail versandt",
            "Switch to activate the status mail.",
            "The following three settings are possible:",
            " onerror - send a mail in case of an error",
            " always - always send a mail",
            " none - never send a mail",
            "Default: " . $param{'SendMail'}
        );
        $config_templ->SetParameterComment(
            "General", "FilePermissions", "",
            "Setzt die Berechtigungen und Eigentümer von Dateien und Ver-",
            "zeichnissen. \"not_set\" deaktiviert diese Funktion. Es können",
            "mehrere Einträge verwendet werden. Jeder Eintrag sollte in einer",
            "eigenen Zeile stehen und folgendes Format haben:",
            "Datei_Verzeichnis:Eigentümer.Gruppe:Berechtigungen:recursive",
            "\"recursive\" aktiviert die rekursive Verarbeitung von Ver-",
            "zeichnissen. Alternativ kann \"recursive\" auch entfallen oder",
            "durch \"not_set\" ersetzt werden, um Verzeichnisse nicht re-",
            "kursiv zu bearbeiten. Soll ein Wert nicht geändert werden,",
            "kann dies mit \"not_set\" angezeigt werden.",
            "Set the ownership and the permissions of files and directories.",
            "\"not_set\" deactivates this function.",
            "It is possible to use several entries. Every entry should be on",
            "its own line and use the following format:",
            "File_Directory:Owner.Group:Permissions:recursive",
            "\"recursive\" activates the recursive directory processing. To",
            "deactivate this, use \"not_set\" or drop \":recursive\".",
            "\"not_set\" can also be used to leave the owner or the",
            "permissions unchanged.",
            "Beispiel / Example :",
            "FilePermissions=<<EOT",
            "/var/lib/awstats:wwwrun.root:0700:recursive",
            "/var/lib/wwwlogs/mylog:not_set:0700",
            "EOT",
            "Default: " .
            $param{'FilePermissions'}->[0]
        );
        $config_templ->SetParameterComment(
            "General", "ExcludePattern", "",
            "Alle Dateinamen mit diesem Muster werden ignoriert.",
            "Als Muster können normale Zeichenketten oder reguläre",
            "Ausdrücke verwendet werden.",
            "Ignore all files whose names match this pattern.",
            "You can use strings or regular expressions as pattern.",
            "Default: " . $param{'ExcludePattern'}
        );
        $config_templ->SetParameterComment(
            "General", "IncludePattern", "",
            "Alle Dateinamen mit diesem Muster werden heruntergeladen.",
            "Als Muster können normale Zeichenketten oder reguläre",
            "Ausdrücke verwendet werden.",
            "Download all files whose names match this pattern.",
            "You can use strings or regular expressions as pattern.",
            "Default: " . $param{'IncludePattern'}
        );
        $config_templ->SetParameterComment(
            "General", "SortLogFile", "",
            "Sortiert die Einträge der zusammengefügten Logdatei aufsteigend",
            "nach dem Datum",
            "Sort the entries of the concatenated logfile ascending by date.",
            "Default: " .
            $param{'SortLogFile'}
        );
    }

    # write configuration to file
    $config_templ->WriteConfig($filename);
    INFO(
        sprintf(
            "Configuration template file \"%s\" successfully written.\n",
            $filename
        )
    );
    EXIT( $returnvalue{'successfull'} );
}

#-----------------------------------------------------------------------
# copy the temp output file back into the temp input file
# parameters: none
# return value: none
#-----------------------------------------------------------------------
sub copytempfiles() {
    DEBUG( 1, "Copy temp output back to temp input\n" );

    # set file position back to start
    seek( $tempinput_handle,  0, 0 );
    seek( $tempoutput_handle, 0, 0 );

    # copy temp output to temp input
    if ( !copy( $tempoutput_name, $tempinput_name ) ) {
        ERROR("During copy tempfiles: $!\n");
        EXIT( $returnvalue{'error_copytempfiles'} );
    }

    # set file position back to start
    seek( $tempinput_handle,  0, 0 );
    seek( $tempoutput_handle, 0, 0 );

    # empty temp output file
    truncate( $tempoutput_handle, 0 );
}

#-----------------------------------------------------------------------
# open an ftp connection, login and change to log directory
# parameters:
#   1. ftp server
#   2. ftp user
#   3. password
#   4. directory
# return value:
#   ftp object
#-----------------------------------------------------------------------
sub startftp($$$$) {
    my ( $server, $user, $password, $directory ) = @_;
    my $ftp;
    my $ftp_debug_flag   = 0;
    my $ftp_passive_flag = 0;

    INFO("Initialise FTP connection\n");
    DEBUG( 2, " server: $server\n" );
    DEBUG( 2, " ftpuser: $user\n" );
    DEBUG( 2, " ftp directory: $directory\n" );

    if ( $param{'DEBUGLEVEL'} >= 3 ) {
        $ftp_debug_flag = 1;
        DEBUG( 2, "Enabling FTP debug information\n" );
    }
    if ( $param{'FTPPassive'} eq "yes" ) {
        $ftp_passive_flag = 1;
        DEBUG( 2, "Enable FTP passive mode\n" );
    }

    # create a new ftp object
    $ftp = Net::FTP->new(
        $server,
        Debug   => $ftp_debug_flag,
        Passive => $ftp_passive_flag,
        Timeout => $param{'FTPTimeout'}
    );
    if ( !defined $ftp ) {

        # Net::FTP reports constructor failures in $@, not in $!
        ERROR("Can't open ftp connection to $server: $@\n");
        EXIT( $returnvalue{'error_noftpconnection'} );
    }

    if ( !$ftp->login( $user, $password ) ) {

        # the server reply is available via message(), not via $!
        my $reason = $ftp->message;
        chomp $reason;
        ERROR("Login for user $user on ftp server $server failed: $reason\n");
        $ftp->quit;
        EXIT( $returnvalue{'error_ftp_login'} );
    }

    # change directory
    if ( !$ftp->cwd($directory) ) {
        my $reason = $ftp->message;
        chomp $reason;
        ERROR("Can't change to directory $directory: $reason\n");
        $ftp->quit;
        EXIT( $returnvalue{'error_ftp_cd'} );
    }

    # transfer files without transformations
    $ftp->binary();

    return $ftp;
}

#-----------------------------------------------------------------------
# close the ftp connection
# parameters: reference to the ftp object
# return value:
#   0 on error
#   1 on success
#-----------------------------------------------------------------------
sub stopftp($) {
    my $ftp = ${ $_[0] };

    INFO("Close FTP connection\n");
    if ( defined $ftp ) {

        # close ftp connection
        DEBUG( 2, "close ftp connection\n" );
        $ftp->quit();
        return 1;
    }
    else {
        DEBUG( 2, "No ftp connection to close\n" );
        return 0;
    }
}

#-----------------------------------------------------------------------
# check all settings and create necessary files and directories
# parameters: none
# return value: none
#-----------------------------------------------------------------------
sub check_options() {

    # check that LocalFileDirectory is a directory and not a link or
    # something else
    if (   ( -e $param{'LocalFileDirectory'} )
        && ( !-d $param{'LocalFileDirectory'} ) )
    {
        ERROR(
            sprintf(
                "Local log directory \"%s\" exists, but\n"
                  . "it is not a directory!\n"
                  . "Please check this or set \"LocalFileDirectory\" to a real directory.\n",
                $param{'LocalFileDirectory'}
            )
        );
        EXIT( $returnvalue{'warning_localdirectory'} );
    }

    # create LocalFileDirectory
    if ( !-d $param{'LocalFileDirectory'} ) {
        if ( !mkdir $param{'LocalFileDirectory'} ) {
            ERROR(
                sprintf(
                    "Can't create LocalFileDirectory \"%s\": %s!\n"
                      . "Please create this directory manually.\n",
                    $param{'LocalFileDirectory'}, $!
                )
            );
            EXIT( $returnvalue{'error_createlocaldirectory'} );
        }
    }

    # check preprocessing program
    if (   ( lc( $param{'PreProcessingProgram'} ) ne "not_set" )
        && ( !-x $param{'PreProcessingProgram'} ) )
    {
        ERROR(
            sprintf(
                "PreProcessingProgram \"%s\" not found or not executable!\n",
                $param{'PreProcessingProgram'}
            )
        );
        EXIT( $returnvalue{'warning_nopreprocessingprogram'} );
    }

    # check main processing program
    if ( !-x $param{'ProcessingProgram'} ) {
        ERROR(
            sprintf(
                "No main processing program \"%s\" available!\n",
                $param{'ProcessingProgram'}
            )
        );
        EXIT( $returnvalue{'warning_nomainprogram'} );
    }

    # check postprocessing program
    # (lc() must wrap only the parameter, not the whole comparison)
    if (   ( lc( $param{'PostProcessingProgram'} ) ne "not_set" )
        && ( !-x $param{'PostProcessingProgram'} ) )
    {
        ERROR(
            sprintf(
                "PostProcessingProgram \"%s\" not found or not executable!\n",
                $param{'PostProcessingProgram'}
            )
        );
        EXIT( $returnvalue{'warning_nopostprocessingprogram'} );
    }

    # check DestinationLogFile
    if (   ( $param{'DestinationLogFile'} ne "none" )
        && ( -e $param{'DestinationLogFile'} )
        && ( !-f $param{'DestinationLogFile'} ) )
    {
        ERROR(
            sprintf(
                "The final logfile \"%s\" exists, but it is not\n"
                  . "a regular file! Please check this or set "
                  . "\"DestinationLogFile\" to a real file.\n",
                $param{'DestinationLogFile'}
            )
        );
        EXIT( $returnvalue{'warning_destinationlogfile'} );
    }

    # check permissions of the config file
    if ( ( ( stat( $param{'CONFIGFILE'} ) )[2] & 00077 ) > 0 ) {
        ERROR(
            sprintf(
                "The configuration file \"%s\" is group and/or world\n"
                  . "readable and/or writeable! Please remove the "
                  . "permissions for group and world.\n",
                $param{'CONFIGFILE'}
            )
        );
        EXIT( $returnvalue{'warning_configfileperms'} );
    }
}

#-----------------------------------------------------------------------
# get command line options
# parameters: none
# return value: none
#-----------------------------------------------------------------------
sub get_options() {

    # temporary variables used by Getopt::Long
    my $config_debug;
    my $config_file;
    my $config_little;
    my $config_man;
    my $config_section;
    my $config_template;
    my $config_usage;
    my $config_verbose;
    my $config_version;

    DEBUG( 1, "Process the commandline options\n" );

    # get command line options
    GetOptions(
        'configfile=s' => \$config_file,
        'debug+'       => \$config_debug,
        'help|?'
          => \$config_usage,
        'little'     => \$config_little,
        'man'        => \$config_man,
        'section=s'  => \$config_section,
        'template=s' => \$config_template,
        'verbose'    => \$config_verbose,
        'version'    => \$config_version
    ) or show_options( $returnvalue{'warning_wrongsyntax'} );

    # set values
    $param{'CONFIGFILE'}   = $config_file    if ( defined $config_file );
    $param{'DEBUGLEVEL'}   = $config_debug   if ( defined $config_debug );
    $param{'LITTLEL'}      = "yes"           if ( defined $config_little );
    $param{'SECTION'}      = $config_section if ( defined $config_section );
    $param{'ShowMessages'} = "yes"           if ( defined $config_verbose );

    # print the version and exit
    if ( defined $config_version ) {
        print "getweblogfiles.pl Version $param{'VERSION'}\n"
          if ( $param{'ShowMessages'} ne "yes" );
        EXIT( $returnvalue{'successfull'} );
    }

    # show the syntax
    show_options( $returnvalue{'successfull'} ) if ( defined $config_usage );

    # show the documentation
    if ( defined $config_man ) {
        pod2usage(
            -exitval => $returnvalue{'successfull'},
            -verbose => 2
        );
    }

    # create a configuration template
    write_template($config_template) if ( defined $config_template );

    # check configuration files
    if ( !defined $param{'CONFIGFILE'} ) {
        if ( -f replace_wildcards("~/.getweblogfilesrc") ) {
            $param{'CONFIGFILE'} = replace_wildcards("~/.getweblogfilesrc");
        }
        elsif ( -f replace_wildcards("~/.getweblogfiles") ) {
            $param{'CONFIGFILE'} = replace_wildcards("~/.getweblogfiles");
        }
        elsif ( -f replace_wildcards("~/getweblogfiles.d/config") ) {
            $param{'CONFIGFILE'} =
              replace_wildcards("~/getweblogfiles.d/config");
        }

        if (   ( !defined $param{'CONFIGFILE'} )
            || ( !-f $param{'CONFIGFILE'} ) )
        {
            ERROR( "Can't find a configuration file.\n"
                  . "Please select one with the option --configfile.\n" );
            EXIT( $returnvalue{'error_config_nofile'} );
        }
    }
}

#-----------------------------------------------------------------------
# shows the syntax and exits
# parameter
#   optional: exit code
# return value: none
#-----------------------------------------------------------------------
sub show_options ($) {
    my $returnvalue = $_[0];

    print "Usage:\n"
      . " --configfile file      select a specific configuration file\n"
      . " --debug                increment the debug level by one\n"
      . " --help                 show this text\n"
      . " --little               don't write comments to the configuration template\n"
      . " --man                  show the internal man page\n"
      . " --section SectionName  select the section used from the configuration file\n"
      . " --template file        write a configuration file template\n"
      . " --verbose              show program messages\n"
      . " --version              show program name and version, then exit\n"
      . "\n"
      . "You can use pod2(html|man|latex|text) to convert the internal\n"
      . "documentation to various formats.\n"
      . "Example:\n"
      . " pod2latex getweblogfiles.pl - converts the documentation into a tex file.\n";

    EXIT($returnvalue);
}

#-----------------------------------------------------------------------
# extract the date from a CLF entry
# parameter: one line in common log format
# return value: normalized date (YYYYMMDDhhmmss) of a CLF entry
#-----------------------------------------------------------------------
sub getDate ($) {
    my $line    = $_[0];
    my $datePos = index( $line, "[" ) + 1;
    my $normDate;

    # timestamp layout after "[": DD/Mon/YYYY:hh:mm:ss
    $normDate = substr( $line, $datePos + 7, 4 );                # year
    $normDate .= $months{ substr( $line, $datePos + 3, 3 ) };    # month
    $normDate .= substr( $line, $datePos,      2 );              # day
    $normDate .= substr( $line, $datePos + 12, 2 );              # hour
    $normDate .= substr( $line, $datePos + 15, 2 );              # minute
    $normDate .= substr( $line, $datePos + 18, 2 );              # second

    return $normDate;
}

# main program
INFO("getweblogfiles.pl Version $param{'VERSION'}\n");

# set initial variables
$param{'FQDN'} = `hostname --fqdn`;
chomp $param{'FQDN'};

# Query date and time
(
    $param{'SECOND'}, $param{'MINUTE'}, $param{'HOUR'},
    $param{'DAY'},    $param{'MONTH'},  $param{'YEAR'}
) = ( localtime(time) )[ 0, 1, 2, 3, 4, 5 ];
$param{'YEAR'} += 1900;
$param{'MONTH'}++;

# two-digit year, e.g. 04 for 2004 (modulo, not division)
$param{'YEAR_SHORT'} = sprintf( "%02d", $param{'YEAR'} % 100 );

# print start information
DEBUG( 1,
    sprintf( "Time: %02d:%02d:%02d\n",
        $param{'HOUR'}, $param{'MINUTE'}, $param{'SECOND'} ) );
DEBUG( 1,
    sprintf( "Date: %02d.%02d.%d\n",
        $param{'DAY'}, $param{'MONTH'}, $param{'YEAR'} ) );
DEBUG( 1, sprintf( "Host: %s\n", $param{'FQDN'} ) );

# read commandline
get_options();

# create temporary files
# (tempfile() is a plain function; newer File::Temp versions reject
# the class-method call form)
( $tempinput_handle, $tempinput_name ) = File::Temp::tempfile("tempXXXXXX");
( $tempoutput_handle, $tempoutput_name ) =
  File::Temp::tempfile("tempXXXXXX");

# load settings
load_config( $param{'CONFIGFILE'}, $param{'SECTION'} );

# check script settings, files and directories
check_options();

# load already processed logfiles
load_logfiles( $param{'HashFile'} );

# set umask
# umask() returns the previous mask, which may legitimately be 0,
# so only an undefined result indicates a failure
if (   ( $param{'Umask'} ne "none" )
    && ( !defined umask( oct( $param{'Umask'} ) ) ) )
{
    WARNING( "Can't set umask to "
          . $param{'Umask'} . "!\n"
          . "Please change the value in the configuration file.\n" );
}

INFO( "Start processing section \"" . $param{'SECTION'} . "\"\n" );

# ftp stuff
{

    # connect to ftp server
    my $ftp = startftp(
        $param{'FTPServer'},   $param{'FTPUser'},
        $param{'FTPPassword'}, $param{'FTPDirectory'}
    );

    # get file list
    INFO("Get file list\n");
    foreach my $entry ( File::Listing::parse_dir( $ftp->dir() ) ) {
        my ( $name, $type, $size, $date ) = (@$entry)[ 0, 1, 2, 3 ];

        # skip directories
        next if ( $type eq 'd' );

        # check ExcludePattern
        if (   ( $param{'ExcludePattern'} ne "not_set" )
            && ( $name =~ /$param{'ExcludePattern'}/ ) )
        {
            DEBUG( 2,
                "Skip excluded file $name using pattern "
                  . "\"$param{'ExcludePattern'}\"\n" );
            next;
        }

        # add files that are new or whose size or date has changed
        if (   ( !defined $oldlogfiles{$name} )
            || ( $oldlogfiles{$name}->{'size'} != $size )
            || ( $oldlogfiles{$name}->{'date'} != $date ) )
        {

            # check IncludePattern
            if ( $param{'IncludePattern'} ne "not_set" ) {
                if ( $name !~ /$param{'IncludePattern'}/ ) {
                    DEBUG( 2,
                        "Skip not included file $name using pattern "
                          . "\"$param{'IncludePattern'}\"\n" );
                    next;
                }
                else {
                    DEBUG( 2,
                        "Include file $name using pattern "
                          . "\"$param{'IncludePattern'}\"\n" );
                }
            }
            $newlogfiles{$name}->{'size'} = $size;
            $newlogfiles{$name}->{'date'} = $date;
            $newlogfiles{$name}->{'file'} =
              $param{'LocalFileDirectory'} . "/" . $name;
        }
        else {
            DEBUG( 2, "Skip processed file $name\n" );
        }
    }

    # download new files
    INFO("Download new files\n");
    foreach my $file ( keys %newlogfiles ) {
        my $local = $param{'LocalFileDirectory'} . "/" .
          $file;
        unlink($local) if ( ( -e $local ) && ( -f $local ) );
        DEBUG( 2, "Download $file\n" );
        $ftp->get( $file, $local );
        if ( !-e $local ) {
            ERROR("Error during downloading $file - $local\n");
            EXIT( $returnvalue{'error_ftp_getfailed'} );
        }

        # calculate md5 (check open() as well as binmode())
        open( my $fh, '<', $local ) or die "open() failed for $local: $!";
        binmode($fh) or die "binmode() failed for $local";
        $newlogfiles{$file}->{'md5'} =
          Digest::MD5->new->addfile($fh)->hexdigest;
        close($fh);
    }
    stopftp( \$ftp );
}

# remove old files with new date but known md5 sum
{
    my $save = "no";
    foreach my $file ( keys %newlogfiles ) {
        if (   ( defined $oldlogfiles{$file} )
            && ( $oldlogfiles{$file}->{'md5'} eq $newlogfiles{$file}->{'md5'} )
          )
        {
            $oldlogfiles{$file}->{'date'} = $newlogfiles{$file}->{'date'};
            delete( $newlogfiles{$file} );
            DEBUG( 2, "Skip processed file $file - chksum known\n" );
            $save = "yes";
        }
    }

    # if nothing is left to process, the script exits below, so the
    # updated dates have to be saved here
    if ( ( $save eq "yes" ) && ( ( scalar keys %newlogfiles ) == 0 ) ) {
        save_logfiles( $param{'HashFile'} );
    }
}

# no new files -> finish
if ( ( scalar keys %newlogfiles ) == 0 ) {
    INFO("No new files found.\n");
    EXIT( $returnvalue{'successfull'} );
}

# disable output buffering
$| = 1;

# concatenate all files
INFO("Uncompress and concatenate logfiles\n");
{
    my @sortedlogfiles = keys %newlogfiles;
    if ( $param{'LogFileSortOrder'} eq "ascending" ) {

        # sort by ascending filenames
        @sortedlogfiles = sort @sortedlogfiles;
    }
    elsif ( $param{'LogFileSortOrder'} eq "descending" ) {

        # sort by descending filenames
        @sortedlogfiles = reverse sort @sortedlogfiles;
    }
    elsif ( $param{'LogFileSortOrder'} eq "timestamp" ) {

        # sort by timestamp, newest first
        @sortedlogfiles =
          sort { $newlogfiles{$b}->{'date'} <=> $newlogfiles{$a}->{'date'} }
          @sortedlogfiles;
    }

    # iterate in the order chosen above; sorting again here would
    # discard the configured LogFileSortOrder
    foreach my $file (@sortedlogfiles) {
        if ( $file =~ /\.bz2$/ ) {

            # bzipped files; the list form of open() keeps the file
            # name away from the shell
            my $bzcat;
            unless (
                open( $bzcat, "-|", "bzcat", $newlogfiles{$file}->{'file'} ) )
            {
                ERROR(  "During bzcat "
                      . $newlogfiles{$file}->{'file'}
                      . ": $!\n" );
                EXIT(1);
            }
            print $tempoutput_handle <$bzcat>;
            close($bzcat);
        }
        elsif ( $file =~ /\.gz$/ ) {

            # gzipped files
            my $zcat;
            unless (
                open( $zcat, "-|", "zcat", $newlogfiles{$file}->{'file'} ) )
            {
                ERROR(  "During zcat "
                      . $newlogfiles{$file}->{'file'}
                      . ": $!\n" );
                EXIT(1);
            }
            print $tempoutput_handle <$zcat>;
            close($zcat);
        }
        else {

            # uncompressed files
            cat( $newlogfiles{$file}->{'file'}, $tempoutput_handle );
        }
    }
}

# copy the content from temp output to temp input
copytempfiles();

# preprocessing files
INFO("Preprocess logfiles\n");
if ( lc( $param{'PreProcessingProgram'} ) ne "not_set" ) {
    my $result = my_exec( $param{'PreProcessingProgram'} . " "
          . $param{'PreProcessingOptions'}
          . " |" );
    if ( $result != 0 ) {
        ERROR(  "During exec \""
              . $param{'PreProcessingProgram'}
              . "\". Return value: $result\n" );
        EXIT( $returnvalue{'error_preprocessingprogram'} );
    }

    # copy the content from temp output to temp input
    copytempfiles() if ( $param{'PreProcessingSwapTempFiles'} eq "yes" );
}

# sort logfile entries by date and time
if ( $param{'SortLogFile'} eq "yes" ) {
    my @unsortedLog;
    my @sortedLog;

    INFO("Sort logfile by date and time\n");

    # load logfile
    seek( $tempinput_handle, 0, 0 );
    @unsortedLog = <$tempinput_handle>;

    # sort logfile
    @sortedLog = sort { getDate($a) cmp getDate($b) } @unsortedLog;

    # write logfile back
    seek( $tempinput_handle, 0, 0 );
    print $tempinput_handle @sortedLog;
}

# process files
{
    my $result;

    INFO("Process logfiles\n");
    $result = my_exec(
        $param{'ProcessingProgram'} . " " . $param{'ProcessingOptions'} . " |" );
    if ( $result != 0 ) {
        ERROR(  "During exec \""
              . $param{'ProcessingProgram'}
              . "\". Return value: $result\n" );
        EXIT( $returnvalue{'error_processingprogram'} );
    }
}

# copy the content from temp output to temp input
copytempfiles() if ( $param{'ProcessingSwapTempFiles'} eq "yes" );

# postprocess files
# (lc() must wrap only the parameter, not the whole comparison)
INFO("Postprocess logfiles\n");
if ( lc( $param{'PostProcessingProgram'} ) ne "not_set" ) {
    my $result = my_exec $param{'PostProcessingProgram'} . " " .
      $param{'PostProcessingOptions'} . " |";
    if ( $result != 0 ) {
        ERROR(  "During exec \""
              . $param{'PostProcessingProgram'}
              . "\". Return value: $result\n" );
        EXIT( $returnvalue{'error_postprocessingprogram'} );
    }

    # copy the content from temp output to temp input
    copytempfiles() if ( $param{'PostProcessingSwapTempFiles'} eq "yes" );
}

# append new log entries to logfile
if ( $param{'DestinationLogFile'} ne "none" ) {
    INFO(
        sprintf(
            "Concatenate new log entries to logfile \"%s\"\n",
            $param{'DestinationLogFile'}
        )
    );
    if ( open( DESTLOG, ">>", $param{'DestinationLogFile'} ) ) {
        cat( $tempinput_name, \*DESTLOG );
        close(DESTLOG);
    }
    else {
        ERROR("Can't append to \"$param{'DestinationLogFile'}\": $!\n");
    }
}

# remove temporary files
close($tempinput_handle);
unlink($tempinput_name);
$tempinput_handle = $tempinput_name = undef;
close($tempoutput_handle);
unlink($tempoutput_name);
$tempoutput_handle = $tempoutput_name = undef;

# add processed newlogfiles to oldlogfiles
foreach my $file ( keys %newlogfiles ) {
    $oldlogfiles{$file}->{'size'} = $newlogfiles{$file}->{'size'};
    $oldlogfiles{$file}->{'date'} = $newlogfiles{$file}->{'date'};
    $oldlogfiles{$file}->{'md5'}  = $newlogfiles{$file}->{'md5'};
}

# save processed logfiles
save_logfiles( $param{'HashFile'} );

# remove old logs from the ftp server
if ( $param{'DeleteOldFTPLogs'} eq "yes" ) {
    INFO("Delete processed logfiles on the ftp server\n");

    # connect to ftp server
    my $ftp = startftp(
        $param{'FTPServer'},   $param{'FTPUser'},
        $param{'FTPPassword'}, $param{'FTPDirectory'}
    );

    # delete processed logs on ftp server
    foreach my $file ( keys %newlogfiles ) {
        DEBUG( 2, "Delete on ftp: $file\n" );
        if ( !$ftp->delete($file) ) {

            # the server reply is available via message(), not via $!
            my $reason = $ftp->message;
            chomp $reason;
            WARNING("Can't delete $file on ftp: $reason\n");
        }
    }

    # close ftp connection
    stopftp( \$ftp );
}

# remove old local logs
if ( $param{'DeleteOldLocalLogs'} eq "yes" ) {
    INFO("Delete processed local logfiles\n");
    foreach my $file ( keys %newlogfiles ) {
        DEBUG( 2, "Delete local file: $file\n" );
        WARNING("Can't delete local file $file: $!\n")
          if ( !unlink( $newlogfiles{$file}->{'file'} ) );
    }
}

# set file permissions
if ( $param{'FilePermissions'}->[0] ne 'not_set' ) {
    INFO("Set file permissions\n");
    foreach my $file ( @{ $param{'FilePermissions'} } ) {
        my ( $name, $owner, $perms, $recursive ) = split( /:/, $file );
        my $chmod = "chmod ";
        my $chown = "chown ";

        # check configured permissions
        if (   ( !defined $recursive )
            || ( lc($recursive) ne "recursive" ) )
        {
            $recursive = "no";
        }
        else {
            $recursive = "yes";
        }
        if ( ( !defined $owner ) or ( $owner eq "" ) ) {
            $owner = "not_set";
        }
        if ( ( !defined $perms ) or ( $perms eq "" ) ) {
            $perms = "not_set";
        }
        if ( ( lc($owner) eq "not_set" ) && ( lc($perms) eq "not_set" ) ) {
            WARNING( "Skip entry for $name, because owner and "
                  . "permissions are not set.\n" );
            next;
        }

        # set ownership
        if ( lc($owner) ne "not_set" ) {
            if ( $recursive eq "yes" ) { $chown .= "-R " }
            $chown .= $owner . " " . $name;
            my_exec $chown;
        }

        # set file permissions
        if ( lc($perms) ne "not_set" ) {
            if ( $recursive eq "yes" ) { $chmod .= "-R " }
            $chmod .= $perms . " " . $name;
            my_exec $chmod;
        }
    }
}

# empty hash
%newlogfiles = ();

INFO("Script finished successfully\n");
EXIT( $returnvalue{'successfull'} );

#-----------------------------------------------------------------------
# END OF SCRIPT
#-----------------------------------------------------------------------
__END__
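
# Illustration only (nothing after __END__ is executed by perl):
# getDate() reads fixed offsets after the "[" of a Common Log Format
# line and builds a YYYYMMDDhhmmss string, so a plain string comparison
# orders entries chronologically. The stand-alone sketch below assumes
# that %months maps English month abbreviations to two-digit numbers;
# that hash is defined elsewhere in the script and is not shown here.
# The name clf_sort_key and the sample log lines are hypothetical.

```perl
use strict;
use warnings;

# Assumption: stand-in for the script's %months hash.
my %months = (
    Jan => "01", Feb => "02", Mar => "03", Apr => "04",
    May => "05", Jun => "06", Jul => "07", Aug => "08",
    Sep => "09", Oct => "10", Nov => "11", Dec => "12",
);

# Hypothetical stand-alone copy of the fixed-offset extraction.
# Timestamp layout after "[": DD/Mon/YYYY:hh:mm:ss
sub clf_sort_key {
    my ($line) = @_;
    my $pos = index( $line, "[" ) + 1;
    return substr( $line, $pos + 7, 4 )            # year
      . $months{ substr( $line, $pos + 3, 3 ) }    # month
      . substr( $line, $pos,      2 )              # day
      . substr( $line, $pos + 12, 2 )              # hour
      . substr( $line, $pos + 15, 2 )              # minute
      . substr( $line, $pos + 18, 2 );             # second
}

# Made-up sample entries, deliberately out of order.
my @log = (
    '10.0.0.2 - - [11/Oct/2004:08:00:00 +0200] "GET /b HTTP/1.0" 200 10',
    '10.0.0.1 - - [10/Oct/2004:13:55:36 +0200] "GET /a HTTP/1.0" 200 10',
);

# Same comparison as the SortLogFile stage: string compare on the key.
my @sorted = sort { clf_sort_key($a) cmp clf_sort_key($b) } @log;

print clf_sort_key( $log[1] ), "\n";    # prints 20041010135536
print "$_\n" for @sorted;
```

# The key ignores the time zone offset, so entries from servers in
# different zones are ordered by their local timestamps only.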