Raising Events with Files
File events are special events that are linked to a directory on the host that a Platform Agent runs on. If they are defined, the Platform Agent will monitor the file system and wait for a file to arrive that matches the file pattern. File events are supported by all UNIX, Windows, and OpenVMS Platform Agents.
Several strategies are available to efficiently and reliably detect files.
- A reliable detection will detect file changes exactly once, so not too often, but without missing any changes.
- Detection should not be resource intensive.
- When a process looks for the file it should be present. In other words, the file system detection event should not detect a file that is not present when a process runs and checks for the file.
The most efficient and reliable way to detect file changes and convert these into unique file events is to detect local files only, with a move directory (so that all files result in exactly one event) and file age checks to prevent expensive lock tests.
FileEventCredential Process Server Parameter
The FileEventCredential Process Server parameter lets you specify the user that RunMyJobs uses to detect files and, optionally, to move them.
Syntax
[<NTLM DOMAIN>\]<real_user>@<domain> - uses credential `login-<real_user>-<domain>`
<NTLM DOMAIN>\<user>/<password> - hardcoded password (not recommended)
<real_user>/<password>@<domain> - hardcoded password (not recommended)
- `<NTLM DOMAIN>`: Legacy NTLM domain name. For example: `MYDOMAIN`.
- `<real_user>`: Windows username as specified in the Real User of the credential. For example: `jdoe`.
- `<domain>`: ActiveDirectory domain, also known as Kerberos realm or DNS domain name. For example: `mydomain.example.local`.
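As an illustration of the first syntax form, the following Python sketch derives the credential name `login-<real_user>-<domain>` from a parameter value. The function name `credential_name` is hypothetical, and the parsing rules are an assumption based only on the syntax listed above; the product may validate values differently.

```python
def credential_name(parameter_value: str) -> str:
    # Derive the credential name for the form [NTLM DOMAIN\]real_user@domain.
    # Hypothetical helper: not part of RunMyJobs.
    if "/" in parameter_value:
        # The two hardcoded-password forms contain a slash and do not
        # refer to a stored credential.
        raise ValueError("value embeds a password; no credential is looked up")
    # Drop the optional legacy NTLM domain prefix (everything up to the backslash).
    _, _, user_and_domain = parameter_value.rpartition("\\")
    real_user, separator, domain = user_and_domain.partition("@")
    if not (real_user and separator and domain):
        raise ValueError("expected a value containing <real_user>@<domain>")
    return f"login-{real_user}-{domain}"


print(credential_name(r"MYDOMAIN\jdoe@mydomain.example.local"))
```

With the example values above, the sketch prints `login-jdoe-mydomain.example.local`.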
Networked File Systems
Redwood recommends that you use file events only for local files, not for networked files on an NFS or SMB share. This is because the remote file system will always introduce some unreliability, and this may cause spurious or missed file events. If you detect files on networked file systems, make sure the processes that depend on the file system are reliable, in that they do not cause errors when a file is suddenly missing because of caching effects or file server reboots.
If you want networked files to be detected on Windows, you must specify a valid credential in the FileEventCredential Process Server parameter. Note that the user defined in the credential must at least have permissions to read the file, and if a move directory is involved, the user must have permissions to move files into it (in other words, the user must have read and delete privileges on the source file and create privileges in the target directory). The Process Server parameter must be set, because the Platform Agent runs as `NT Authority\System`, and this account has no access to network shares or mapped drives.
Networked file events are not supported on UNIX/HP OpenVMS.
Files located on a SAN are considered local files. NFS or SMB mounted files (NAS) are considered networked files.
File events are provided by the Platform Agent, which requires a `ProcessServerService.OS.limit` license key for UNIX and Microsoft Windows Platform Agents, and a `ProcessServerService.VMS.limit` license key for OpenVMS Platform Agents.
The file event definition sets the following things:
- The directory where the file(s) are detected.
- The file name or name pattern.
- The minimum file size.
- The minimum age since last modification.
- The lock flag. If this is set, the system verifies that the file is not locked or open by any process.
- An optional move directory, which further ensures that each file is detected exactly once.
- An overwrite flag. If this is set, detected files will overwrite existing files in the move directory.
- The poll interval (how often the file event is checked).
- A raise comment (allows a custom comment to accompany the raised event).
Detection
If you do not specify a move directory, files are left where they are when detected. If their timestamp is changed, or they are deleted and then recreated, they will be detected again. If you specify a wildcard, an event will be raised for every file that changes or is created. To make this work reliably, the Platform Agent keeps a state file that contains the timestamp for every file.
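The idea of a timestamp state file can be sketched in Python as follows. This is a simplified model, not the Platform Agent's implementation: the function name `detect_changes`, the in-memory dictionary standing in for the state file, and the choice to forget files that have disappeared are all assumptions made for illustration.

```python
from typing import Dict, List


def detect_changes(state: Dict[str, float], listing: Dict[str, float]) -> List[str]:
    # `state` maps file names to the timestamp seen at the previous poll.
    # `listing` maps file names to their current modification timestamp.
    # A file is reported when it is new or when its timestamp has changed.
    changed = [name for name, mtime in sorted(listing.items())
               if state.get(name) != mtime]
    # Forget files that disappeared, so a recreated file is reported again.
    for name in list(state):
        if name not in listing:
            del state[name]
    state.update(listing)
    return changed


state: Dict[str, float] = {}
print(detect_changes(state, {"a.txt": 100.0}))  # first poll: a.txt is new
print(detect_changes(state, {"a.txt": 100.0}))  # unchanged: nothing reported
print(detect_changes(state, {"a.txt": 150.0}))  # timestamp changed: reported again
```

Because the state persists between polls, an unchanged file raises no second event, while a touched or recreated file does.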
The OpenVMS file system supports versioning of files. To ensure reliable detection, all versions of a file will be detected, even if you specify a specific version in the filename attribute.
Wildcards
You cannot have wildcard characters in the directory path, but you can have wildcards and glob matching in the filename.
Character Classes
An expression `[...]` where the first character after the leading `[` is not an `!` matches a single character, namely any of the characters enclosed by the brackets. The string enclosed by the brackets cannot be empty; therefore `]` can be allowed between the brackets, provided that it is the first character. (Thus, `[][!]` matches the three characters `]`, `[` and `!`.)
Ranges
There is one special convention inside a range: two characters separated by `-` denote a range. (Thus, `[A-Fa-f0-9]` is equivalent to `[ABCDEFabcdef0123456789]`.)
One may include `-` in its literal meaning by making it the first or last character between the brackets. (Thus, `[*--]` matches `*`, `+`, `,`, and `-`.)
Complementation
An expression `[!...]` matches a single character, namely any character that is not matched by the expression obtained by removing the first `!` from it. (Thus, `[!]ab]` matches any single character except `]`, `a` and `b`.)
One can remove the special meaning of `?`, `*` and `[` by preceding them by a backslash.
Between brackets these characters stand for themselves. Thus, `[[?*\]` matches `[`, `?`, `*`, and `\`.
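The bracket expressions above can be tried out with Python's standard `fnmatch` module, which follows the same conventions for character classes, ranges, and `!` complementation. This is only a stand-in for experimentation: the Platform Agent uses its own glob parser, and `fnmatch` differs in other respects (for example, it does not support backslash escaping). The helper name `matching` is hypothetical.

```python
from fnmatch import fnmatchcase


def matching(pattern: str, candidates: str) -> str:
    # Return the characters in `candidates` that the one-character
    # glob `pattern` matches (case-sensitive).
    return "".join(c for c in candidates if fnmatchcase(c, pattern))


# Character class containing ], [ and !
print(matching("[][!]", "][!a"))
# Ranges: hexadecimal digits
print(matching("[A-Fa-f0-9]", "Ag9f-"))
# Range from * to - (the second hyphen is the range end)
print(matching("[*--]", "*+,-."))
# Complementation: anything except ], a and b
print(matching("[!]ab]", "]abc"))
# ? matches exactly one character
print(fnmatchcase("done01.txt", "done??.txt"), fnmatchcase("done1.txt", "done??.txt"))
```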
Subdirectories
If you want to detect files in subdirectories, you can use `*/<pattern>` to match any subdirectory at the first level or `**/<pattern>` to match any level of directory. You can use `<pattern1>/<pattern2>` to use glob-style matching for both directory and file names.
Note: As of version 9.0.6, Platform Agents use their own glob parsers. If you need the old platform-specific parser as used in version 9.0.5 and before, you can set `JCS_OLD_GLOB_STYLE` to any value in the Platform Agent environment.
Minimum Age
The minimum age is a time-based method to ensure that each file is complete.
- If you set the minimum age to a positive integer, the Platform Agent will wait the specified number of seconds from the last modification time.
- If you set the minimum age to a negative integer, the Platform Agent will ensure the timestamp of the file does not change for the specified absolute value in seconds, disregarding the last modification time offset.
The field accepts values in the range `-86400` to `86400` (+/-24 hours).
Example: the Platform Agent is running on a system where the current time is 9:30 AM, and a file arrives with a timestamp of 11:30 AM.
- If the Minimum Age field is set to `60`, the event will be fired at 11:31 AM.
- If the Minimum Age field is set to `-60`, the event will be fired at 9:31 AM.
Another example: the Platform Agent is running on a system where the current time is 9:30 AM, and a file arrives with a timestamp of 9:30 AM.
- If the Minimum Age field is set to `60`, the event will be fired at 9:31 AM.
- If the Minimum Age field is set to `-60`, the event will be fired at 9:31 AM.
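The two examples can be reproduced with a small Python model. It is a sketch under stated assumptions, not the Platform Agent's logic: the function name `earliest_fire_time` is hypothetical, the file's timestamp is assumed not to change after arrival, and the poll interval is ignored (in practice the event is raised at the first poll at or after the computed time).

```python
from datetime import datetime, timedelta


def earliest_fire_time(first_seen: datetime, last_modified: datetime,
                       minimum_age: int) -> datetime:
    # Model of the Minimum Age rule, in seconds (range -86400..86400).
    if not -86400 <= minimum_age <= 86400:
        raise ValueError("minimum age must be between -86400 and 86400")
    if minimum_age >= 0:
        # Positive: wait until the file is `minimum_age` seconds older
        # than its last modification time.
        return max(first_seen, last_modified + timedelta(seconds=minimum_age))
    # Negative: wait until the timestamp has been stable for the absolute
    # value, regardless of what the modification time itself says.
    return first_seen + timedelta(seconds=-minimum_age)


arrival = datetime(2024, 1, 1, 9, 30)    # agent clock when the file arrives
future = datetime(2024, 1, 1, 11, 30)    # file timestamp two hours ahead
print(earliest_fire_time(arrival, future, 60))    # 11:31, as in the first example
print(earliest_fire_time(arrival, future, -60))   # 9:31
print(earliest_fire_time(arrival, arrival, 60))   # 9:31, as in the second example
print(earliest_fire_time(arrival, arrival, -60))  # 9:31
```

The model shows why a negative value is useful when file timestamps come from a system whose clock differs from the Platform Agent's.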
On UNIX systems, Redwood recommends using a non-zero minimum age instead of a lock test, because it is a more efficient use of computing resources.
Lock Detection
The Platform Agent can check for locks on files if the lock attribute is set.
On Windows and OpenVMS, this is done by checking that there are no outstanding file locks. This works quickly and reliably on local files.
On UNIX, checking for locks would not work, because UNIX does not lock files that are open for read or write. Therefore, RunMyJobs uses `fuser` or an equivalent program to check in the kernel whether any process has the inode of the file open. This has two potential issues: performance and reliability. The performance of the lock test program depends on how the UNIX system stores open file descriptors. Some operating systems must enumerate all open files and take several seconds to find out whether a file is locked. Also, if the file is stored on an NFS file system, the lock test will not reveal any locks held by other systems.
Instead of the lock test, Redwood recommends using the minimum file size and minimum file age test.
Changing the Lock Test on UNIX
You can set the command used to perform the file-is-locked check for UNIX Platform Agents using the FileInUseCommand Process Server parameter. Its default value is:
Operating System | Command |
---|---|
Linux | lsof "${Filename} " |
UNIX | /usr/sbin/fuser "${Filename} " |
OpenVMS | N/A |
Microsoft Windows | N/A |
You can use your own command, but it should behave like either `fuser` or `lsof`. The rules that the Platform Agent uses to parse the output are as follows.
- For `fuser`, if it finds at least one line of the format `^${file}:[ ]*[0-9]+`, the file is locked. Note that the `fuser` implementation generally either is silent on an unlocked file (for example, on Linux), or produces `^${file}:[ ]*` (for example, on AIX 5), so the test on digits after the whitespace is necessary.
- For `lsof`, if the result value is error and the output is empty, the file is not locked. If the result value is OK (and there is output), the file is locked.
The system determines whether to check the `lsof` type output by scanning the `FileInUseCommand` for the string `lsof`.
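The parsing rules above can be expressed as a short Python sketch, which may help when testing a custom command. The function name `file_is_locked` is hypothetical and this is not the Platform Agent's code. The text does not say how an `lsof`-style command with a mixed result (for example, an error with output) is treated; the sketch assumes such cases count as not locked.

```python
import re


def file_is_locked(in_use_command: str, filename: str,
                   exit_code: int, output: str) -> bool:
    # Interpret the result of a FileInUseCommand-style lock test.
    if "lsof" in in_use_command:
        # lsof style: OK result with output means locked.
        # Assumption: every other combination is treated as not locked.
        return exit_code == 0 and output.strip() != ""
    # fuser style: locked if a line reads "<file>:" followed by optional
    # spaces and at least one digit (a process ID).
    pattern = re.compile("^" + re.escape(filename) + ":[ ]*[0-9]+", re.MULTILINE)
    return pattern.search(output) is not None


fuser_cmd = '/usr/sbin/fuser "${Filename}"'
print(file_is_locked(fuser_cmd, "/data/in.csv", 0, "/data/in.csv:  1234"))  # PID present
print(file_is_locked(fuser_cmd, "/data/in.csv", 0, "/data/in.csv: "))       # no PID
```

The second call models the AIX-style output that names the file but lists no process, which is why the digits in the pattern matter.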
Move Directory
If your filename pattern matches many files, it is more efficient to use a move directory. This allows the system to move detected files to a new directory, keeping the directory that it needs to watch smaller and thus more efficient. This is also helpful at the OS level: if a file is still in the original directory, it has not been detected yet, and if it is in the move directory, the system has seen the file and the event has been raised (or is in the process of being raised). If the filenames that are detected are not unique over time, set the overwrite flag, or delete the detected file as part of the event's actions. By default, the system will not overwrite the file, because the content may be needed by the system. If your files contain data, keep the overwrite file checkbox cleared. The Platform Agent will then ensure that no data is lost.
The move directory field value can be:
- A full path, including a directory and a filename.
- Just a directory name, in which case the filename will remain the same.
- Just a filename, in which case the file will remain in the same directory with the new filename. Make sure the new filename does not match the file event filename pattern, or the file will be detected with the new name in the next poll. Using this option is not as efficient as a move directory that contains a directory.
The directory path that the move directory contains can be a subdirectory of the original directory (unless the filename specifies subdirectories).
In order to be efficient and reliable, the move directory should be on the same file system as the detection directory. Do not use a move directory on a different file system, even if this is not explicitly forbidden by the system. Some operating systems, on some file systems, transparently convert a move operation into copy and delete operations. This will result in occasional failures and is not recommended.
Warning: The directory specified by the move directory path must already exist.
The OpenVMS file system supports versioning of files. To ensure that no files are overwritten, the file move performed by the Platform Agent will not modify the file version of the moved file, and it will not create a new file version. If you put a file in the detection directory with a version number that already exists in the move directory, the file will not be moved and an error will be raised (unless you have set the Overwrite flag).
File Event Move Directory
All DateTime variables use the format `yyyyMMddHHmmss`. You can specify a Java DateTimeFormatter pattern using the `${FileDateTime:HHmmss}` syntax.
- `${BaseDirectory}`: The path of the directory containing the detected file.
- `${BaseName}`: The base filename of the detected file.
- `${Name}`: The filename of the detected file.
- `${Dot}`: A dot (`.`), such as that found in filenames before the extension.
- `${Extension}`: The extension of the detected file.
- `${CurrentDateTime}`: The Platform Agent date and time when the file was detected.
- `${CurrentTimeStamp}`: The Platform Agent timestamp when the file was detected, in a numeric format containing the milliseconds since 1970, usually referred to as epoch or UNIX time.
- `${DateTime}`: Deprecated in favor of `CurrentDateTime`.
- `${TimeStamp}`: Deprecated in favor of `CurrentTimeStamp`.
- `${FileDateTime}`: The file's modification date and time.
- `${FileTimeStamp}`: The file's modification time, in epoch time.
- `${fileName}`: The filename (full path of file) of the detected file.
- `${UniqueId}`: The unique ID of the file event.
Raiser Comments
File Event Raiser Comments
Events can also be raised by files on servers with a Platform Agent or on AS/400 systems.
The following substitution parameters are available for raiser comments of file events:
All DateTime variables use the format `yyyyMMddHHmmss`. You can specify a Java DateTimeFormatter pattern using the `${FileDateTime:HHmmss}` syntax.
- `${CurrentDateTime}`: The Platform Agent date and time when the file was detected.
- `${CurrentTimeStamp}`: The Platform Agent timestamp when the file was detected, in a numeric format containing the milliseconds since 1970, usually referred to as epoch or UNIX time.
- `${FileDateTime}`: The file's modification date and time.
- `${FileTimeStamp}`: The file's modification time in epoch time.
- `${ServerDateTime}`: The RunMyJobs server date and time when the event was raised.
- `${ServerTimeStamp}`: The RunMyJobs server time when the event was raised, in epoch time.
- `${processServer}`: The name of the Process Server.
- `${server}`: The name of the Platform Agent.
- `${filename}`: The path of the detected file (before any move).
- `${finalPath}`: The new path of the detected file (after any move).
The default file event raiser comment is the following:
File event raised by "${filename}" on "${server}"
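To preview what a raiser comment will look like, the substitution can be imitated in Python. The standard `string.Template` class happens to use the same `${name}` placeholder syntax, so it serves as a stand-in here; it is not how RunMyJobs performs the substitution, and it does not understand the `${FileDateTime:HHmmss}` pattern form. The file path, agent name, and helper names below are hypothetical.

```python
from datetime import datetime
from string import Template

DEFAULT_COMMENT = 'File event raised by "${filename}" on "${server}"'


def render_comment(template: str, values: dict) -> str:
    # Replace ${name} placeholders with the supplied values.
    return Template(template).substitute(values)


def default_datetime_format(moment: datetime) -> str:
    # Equivalent of the yyyyMMddHHmmss format used by the DateTime variables.
    return moment.strftime("%Y%m%d%H%M%S")


values = {
    "filename": r"C:\tmp\done01.txt",  # hypothetical detected file
    "server": "WIN-SERV-03",           # hypothetical Platform Agent name
}
print(render_comment(DEFAULT_COMMENT, values))
print(default_datetime_format(datetime(2024, 3, 5, 9, 30, 7)))
```

The last line prints `20240305093007`, which shows the shape of a value such as `${FileDateTime}` in the default format.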
The following topics cover file events in more detail.
Error Handling
When a Platform Agent encounters a problem with a file that it has detected, the following things happen.
- A message is logged in the network processor log file.
- An Operator Message attached to the file event definition is sent to the server.
- The file event scan interval is set to a large value. After the first error, it is set to one hour. If the value is already one hour or longer, it is doubled until the scan interval is once per day. If the next attempt succeeds, the scan interval is reset to the original value.
The scan interval is increased because repeating the attempt after the original scan interval would generate huge numbers of error messages.
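The growth of the scan interval after repeated errors can be modeled as follows. This Python sketch is an interpretation of the rule described above, not the Platform Agent's code: the function names are hypothetical, and capping the doubled value at exactly one day is an assumption.

```python
from typing import List

ONE_HOUR = 3600   # seconds
ONE_DAY = 86400   # seconds


def interval_after_error(current_interval: int) -> int:
    # Scan interval (in seconds) to use after a failed attempt.
    if current_interval < ONE_HOUR:
        # First error on a short interval: jump to one hour.
        return ONE_HOUR
    # Already one hour or longer: double, but never beyond once per day.
    return min(current_interval * 2, ONE_DAY)


def backoff_sequence(original_interval: int, failures: int) -> List[int]:
    # Intervals used after each of `failures` consecutive errors.
    intervals = []
    current = original_interval
    for _ in range(failures):
        current = interval_after_error(current)
        intervals.append(current)
    return intervals


# A file event polled every 60 seconds that fails seven times in a row.
print(backoff_sequence(60, 7))
```

Under these assumptions, a 60-second scan interval grows to one hour, then two, four, eight, and sixteen hours, and then stays at one day until an attempt succeeds and the original value is restored.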
If you saw file event error messages in the Operator Message console and have repaired the problem, reset the Platform Agent event timing by stopping and then starting the Process Server. The error messages that cause this are accompanied by a message stating until when the Platform Agent will refrain from checking the files for this file event. The current list of possible errors includes:
- Could not check whether file is in use: {command}.
- File event {name} not raised because the file could not be renamed from {source} to {destination}.
- File event {name} not raised for detected file {source} because {directory} is not an existing directory.
- File event {name} not raised for detected file {source} because the variables in {moveDirectory} could not be substituted.
- File event {name} not raised as the target move file name could not be built: Components {moveDirectory} and {targetFile}.
Note: Different file events have different scan intervals. When one file event definition has an error on a particular system, it does not cause any other file events defined on the same Process Server to be delayed.
Creating a File Event Definition
To create a file event definition:
- Navigate to Definitions > Event Definitions.
- Click .
- Configure the event using the table below.
- Click the File Event Definition tab and then click New.
- Configure the event using the table below.
- Click Save & Close.
Event Definition
Sample Configuration Data
The following is sample configuration data for a file event definition.
Event Definition Tab
Name: DataWareHouseLoaded
Description: Data Warehouse loaded
Comment: The data warehouse has been loaded. Reports that require data from the data warehouse should wait for this event to be raised.
File Event Definition Tab
Name: DataWareHouseLoaded
Description: Data Warehouse loaded
Comment: The data warehouse has been loaded. Reports that require data from the data warehouse should wait for this event to be raised.
Enabled: <true>
Directory: C:\tmp\
Pattern: done??.txt
Move Directory: C:\done\
Check Lock: <true>
Process Server: WIN-SERV-03