Difference between revisions of "Mu scan datetime"

From Mailutils

Revision as of 18:52, 7 July 2014

#include <mailutils/datetime.h>

int mu_scan_datetime (const char *input, const char *format, struct tm *tm,
                      struct mu_timezone *tz, char **endp);

The function mu_scan_datetime scans the string input, which is expected to contain date/time information, according to format and places the resulting broken-down time in the tm variable. Unless tz is NULL, the time zone information is stored there. Unless endp is NULL, mu_scan_datetime stores in *endp the address of the character where scanning stopped.

The function returns 0 on success, MU_ERR_PARSE if input does not satisfy format, and MU_ERR_FORMAT if format itself is erroneous. In the latter case, detailed information about the error is written to the debugging channel mailbox.error.

The format follows the mu_c_streamftime format specification with several extensions. It is interpreted as follows:

  • A sequence of white-space characters (space, tab, newline, etc.; see mu_isspace) matches any amount of white space, including none, in the input.
  • An ordinary character (i.e., one other than white space or %) must exactly match the next character of input.
  • A conversion specifier parses the next characters of input according to its definition and stores the result in the corresponding location in tm or tz (unless the latter is NULL).
  • A flow control conversion alters the way the format is interpreted.

Flow control specifiers are:

%?
This specifier matches any single character on input.
%\C
(where C is any single character) This specifier matches character C exactly. It is useful for specifying mandatory whitespace, as in:
 "%d-%b-%Y%\ %H:%M:%S %z"

The above format requires date and time parts to be separated by a single space character.
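
The literal-matching rules described so far (white space, ordinary characters, %? and %\C) can be illustrated with a small stand-alone sketch. It is not the Mailutils implementation: toy_match_literals is a hypothetical helper, it uses the standard isspace in place of mu_isspace, and it rejects every other % sequence instead of parsing conversions.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy matcher (hypothetical, not part of Mailutils).  Handles only white
   space, ordinary characters, %? and %\C.  Returns a pointer just past the
   matched input, or NULL on mismatch or on an unsupported % sequence. */
static const char *
toy_match_literals (const char *input, const char *format)
{
  while (*format)
    {
      if (isspace ((unsigned char) *format))
        {
          /* Consume the whole white-space run in the format ... */
          while (isspace ((unsigned char) *format))
            format++;
          /* ... and any amount of white space (possibly none) in input. */
          while (isspace ((unsigned char) *input))
            input++;
        }
      else if (*format == '%')
        {
          format++;
          if (*format == '?')
            {
              /* %? matches any single character, but not end of input. */
              if (*input == '\0')
                return NULL;
              input++;
              format++;
            }
          else if (*format == '\\' && format[1] != '\0')
            {
              /* %\C matches the character C exactly once. */
              if (*input != format[1])
                return NULL;
              input++;
              format += 2;
            }
          else
            return NULL;        /* conversions are outside this sketch */
        }
      else
        {
          /* An ordinary character must match exactly. */
          if (*input != *format)
            return NULL;
          input++;
          format++;
        }
    }
  return input;
}
```

With this sketch, the format "03-May-2011 13:25" accepts "03-May-201113:25" as well as input with several spaces between the two parts, whereas "03-May-2011%\ 13:25" accepts exactly one space. In C source code the backslash itself must be escaped, so the latter is written as "03-May-2011%\\ 13:25".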

%$
This specifier marks an optional end of input. That is, if the input does not match the conversions past this specifier, it is not considered an error. Instead, endp is set to the position where scanning stopped and 0 is returned. The tm and tz variables are left with the partial information collected so far.

Consider, for example, the date format used in the IMAP SEARCH command. It is basically described as follows:

 "%d-%b-%Y %H:%M:%S %z"

However, the time and timezone information is optional. To allow for that, the following format can be used instead:

 "%d-%b-%Y%$ %H:%M:%S %z"

(see the MU_DATETIME_INTERNALDATE define in mailutils/datetime.h[1]).

%[ %| %]
The part of format enclosed between %[ and %] specifiers is optional. The specifiers can be nested to any depth. The %| specifier appearing within the optional group introduces an optional alternative. Any number of alternatives may be present.

Let's return to the above format for an example. It can be rewritten using an optional group as follows:

 "%d-%b-%Y%[ %H:%M:%S %z%]"

As a more complex example, consider the RFC-822 date format. It has two optional parts: the day of week at the beginning and seconds at the end. The corresponding scan format is:

 "%[%a, %]%e %b %Y %H:%M%[:%S%] %z"

See the MU_DATETIME_SCAN_RFC822 define in mailutils/datetime.h.

The following example illustrates nested optional groups:

 "%[%a%[,%] %]%d %b %Y %H:%M:%S %z"

It matches any of the following date specifications:

 Tue, 03 May 2011 13:25:26 +0200
 Tue 03 May 2011 13:25:26 +0200
 03 May 2011 13:25:26 +0200

%( %| %)
Alternatives. This specifier requires that the input match one of the alternatives separated by %|. Any number of alternatives may be specified, but at least two must be present.

For example, the following format

 "%a%(,%|:%|/%) %d %b %Y %H:%M:%S %z"

requires the day of week (%a) to be followed by a single comma, colon or slash. Any other character following it will cause the scan to fail.

Optional blocks and alternative specifiers can be nested.
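
The grouping rules for %[ %| %] and %( %| %) can be sketched with a toy matcher. This is only an illustration under stated assumptions, not the Mailutils scanner: toy_group_scan, toy_match and toy_accepts are hypothetical names; every character outside a group specifier, including the space, is treated as an ordinary character; alternatives are tried left to right and the first one that matches is taken, with no backtracking; and closers are not checked against their openers.

```c
#include <stddef.h>
#include <string.h>

/* Starting inside a group body at p, return the '%' of the group's closer,
   or of the first top-level %| when stop_at_bar is nonzero.  Returns NULL
   if the group is unterminated. */
static const char *
toy_group_scan (const char *p, const char *end, int stop_at_bar)
{
  int depth = 0;

  while (p + 1 < end)
    {
      if (*p == '%')
        {
          char c = p[1];
          if (c == '[' || c == '(')
            depth++;                    /* nested group opens */
          else if (c == ']' || c == ')')
            {
              if (depth == 0)
                return p;               /* closer of our own group */
              depth--;
            }
          else if (c == '|' && depth == 0 && stop_at_bar)
            return p;
          p += 2;
        }
      else
        p++;
    }
  return NULL;
}

/* Match the format slice [fmt, end) against in.  Returns a pointer past
   the matched input, or NULL on mismatch. */
static const char *
toy_match (const char *fmt, const char *end, const char *in)
{
  while (fmt < end)
    {
      if (fmt[0] == '%' && fmt + 1 < end && (fmt[1] == '[' || fmt[1] == '('))
        {
          int optional = fmt[1] == '[';
          const char *alt = fmt + 2;
          const char *closer = toy_group_scan (alt, end, 0);
          const char *hit = NULL;

          if (!closer)
            return NULL;                /* unterminated group */
          /* Try each alternative in turn until one matches. */
          while (!hit && alt <= closer)
            {
              const char *stop = toy_group_scan (alt, closer + 2, 1);
              hit = toy_match (alt, stop, in);
              alt = stop + 2;
            }
          if (hit)
            in = hit;
          else if (!optional)
            return NULL;                /* %( ... %) must match */
          fmt = closer + 2;
        }
      else
        {
          if (*in != *fmt)              /* ordinary character */
            return NULL;
          in++;
          fmt++;
        }
    }
  return in;
}

/* Nonzero if the whole input is matched by the whole format. */
static int
toy_accepts (const char *fmt, const char *input)
{
  const char *rest = toy_match (fmt, fmt + strlen (fmt), input);
  return rest != NULL && *rest == '\0';
}
```

Under these assumptions, toy_accepts ("%[Tue%[,%] %]03 May", s) holds when s is "Tue, 03 May", "Tue 03 May" or "03 May", mirroring the three date forms accepted by the nested optional groups, while toy_accepts ("Tue%(,%|:%|/%) 03", "Tue 03") fails because one of the alternatives is mandatory.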

Notes

See also

mu_c_streamftime