Difference between revisions of "Mu scan datetime"

From Mailutils

Latest revision as of 11:33, 27 November 2020

#include <mailutils/datetime.h>

int mu_scan_datetime (const char *input, const char *format, struct tm *tm,
                      struct mu_timezone *tz, char **endp);

The function mu_scan_datetime scans the string input, which is expected to contain date/time information, according to format and places the resulting broken-down time in the tm variable. Unless tz is NULL, the time zone information is stored there.

On success, the function returns 0. If the input does not satisfy the format, MU_ERR_PARSE is returned. In both cases, unless endp is NULL, the address of the character in input where scanning stopped is stored in the memory location pointed to by endp.

If format itself is erroneous, MU_ERR_FORMAT is returned. Unless endp is NULL, the address of the character in format at which the error was detected is stored in the memory location pointed to by endp[1]. Detailed information about the error is output to the debugging channel mailbox.error.
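The return codes and the two meanings of endp described above can be handled as in the following sketch. It is an illustration only: scan_and_report is a hypothetical helper, and the code assumes that MU_ERR_PARSE, MU_ERR_FORMAT and mu_strerror are declared in mailutils/errno.h and that struct mu_timezone has a utc_offset member holding seconds east of UTC.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <mailutils/datetime.h>
#include <mailutils/errno.h>

/* Hypothetical helper: scan INPUT according to FORMAT, print what
   happened, and return the status from mu_scan_datetime. */
int
scan_and_report (const char *input, const char *format)
{
  struct tm tm;
  struct mu_timezone tz;
  char *end = NULL;
  char buf[64];
  int rc;

  memset (&tm, 0, sizeof tm);
  memset (&tz, 0, sizeof tz);

  rc = mu_scan_datetime (input, format, &tm, &tz, &end);
  switch (rc)
    {
    case 0:
      /* Success: END points into INPUT where scanning stopped. */
      strftime (buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &tm);
      /* Assumption: utc_offset is a member of struct mu_timezone
         and holds seconds east of UTC. */
      printf ("parsed %s, UTC offset %d s, rest of input: \"%s\"\n",
              buf, tz.utc_offset, end ? end : "");
      break;

    case MU_ERR_PARSE:
      /* INPUT does not satisfy FORMAT: END points into INPUT. */
      if (end)
        printf ("input mismatch at offset %ld\n", (long) (end - input));
      break;

    case MU_ERR_FORMAT:
      /* FORMAT is erroneous: END points into FORMAT, not INPUT. */
      if (end)
        printf ("bad format at offset %ld\n", (long) (end - format));
      break;

    default:
      printf ("unexpected error: %s\n", mu_strerror (rc));
    }
  return rc;
}
```

A call such as scan_and_report ("03-May-2011 13:25:26 +0200", "%d-%b-%Y %H:%M:%S %z") exercises the success path. The important design point is that the offset is computed against input or against format depending on the status code.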

The format follows the mu_c_streamftime format specification with several extensions. It is interpreted as follows:

  • A sequence of white-space characters (space, tab, newline, etc.; see mu_isspace) matches any amount of white space, including none, in the input.
  • An ordinary character (i.e., one other than white space or %) must exactly match the next character of input.
  • A conversion specifier parses the next characters of input according to its definition and stores the result in the corresponding location in tm or tz (unless the latter is NULL).
  • A flow control conversion alters the way the format is interpreted.

Flow control specifiers are:

%?
This specifier matches any single character on input.
%\C
(where C is any single character) This specifier matches character C exactly. It is useful for specifying mandatory whitespace, as in:
 "%d-%b-%Y%\ %H:%M:%S %z"

The above format requires date and time parts to be separated by a single space character.
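The rules for white space, ordinary characters, %? and %\C can be illustrated with a small standalone matcher. This is only a sketch of the documented rules, not the Mailutils implementation: sketch_match and the SKETCH_* codes are hypothetical names, the standard isspace stands in for mu_isspace, and every other % conversion is simply rejected as a format error.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical status codes, loosely mirroring 0, MU_ERR_PARSE and
   MU_ERR_FORMAT. */
enum { SKETCH_OK = 0, SKETCH_ERR_PARSE = 1, SKETCH_ERR_FORMAT = 2 };

/* Match INPUT against FORMAT using only the literal-matching rules.
   On return, *ENDP (if ENDP is not NULL) points into FORMAT for a
   format error and into INPUT otherwise. */
int
sketch_match (const char *input, const char *format, const char **endp)
{
  int rc = SKETCH_OK;

  while (rc == SKETCH_OK && *format)
    {
      if (isspace ((unsigned char) *format))
        {
          /* A run of white space in the format matches any amount
             of white space in the input, including none. */
          while (isspace ((unsigned char) *format))
            format++;
          while (isspace ((unsigned char) *input))
            input++;
        }
      else if (*format == '%' && format[1] == '?')
        {
          /* %? matches any single character. */
          if (*input == '\0')
            rc = SKETCH_ERR_PARSE;
          else
            {
              input++;
              format += 2;
            }
        }
      else if (*format == '%' && format[1] == '\\' && format[2] != '\0')
        {
          /* %\C matches the character C exactly. */
          if (*input != format[2])
            rc = SKETCH_ERR_PARSE;
          else
            {
              input++;
              format += 3;
            }
        }
      else if (*format == '%')
        /* Any other conversion is outside the scope of this sketch. */
        rc = SKETCH_ERR_FORMAT;
      else if (*input != *format)
        /* An ordinary character must match exactly. */
        rc = SKETCH_ERR_PARSE;
      else
        {
          input++;
          format++;
        }
    }

  if (endp)
    *endp = (rc == SKETCH_ERR_FORMAT) ? format : input;
  return rc;
}
```

With this sketch, the format "a b" accepts both "ab" and "a   b", whereas "a%\ b" accepts only "a b", which is why %\ is the way to demand exactly one space.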

%$
This specifier marks an optional end of input. That is, if the input does not match the conversions past this specifier, it is not considered an error. Instead, endp is set to the position where scanning stopped and 0 is returned. The tm and tz variables are left with the partial information collected so far.

Consider, for example, the date format used in the IMAP SEARCH command. It is basically described as follows:

 "%d-%b-%Y %H:%M:%S %z"

However, the time and timezone information is optional. To allow for that, the following format can be used instead:

 "%d-%b-%Y%$ %H:%M:%S %z"

(see the MU_DATETIME_INTERNALDATE define in mailutils/datetime.h[2]).

%[ %| %]
The part of the format enclosed between the %[ and %] specifiers is optional. These specifiers can be nested to any depth. The %| specifier appearing within an optional group introduces an optional alternative. Any number of alternatives may be present.

Let's return to the above format for an example. It can be rewritten using an optional group as follows:

 "%d-%b-%Y%[ %H:%M:%S %z%]"

As a more complex example, consider the RFC 822 date format. It has two optional parts: the day of week at the beginning and seconds at the end. The corresponding scan format is:

 "%[%a, %]%e %b %Y %H:%M%[:%S%] %z"

See the MU_DATETIME_SCAN_RFC822 define in mailutils/datetime.h.

The following example illustrates nested optional groups:

 "%[%a%[,%] %]%d %b %Y %H:%M:%S %z"

It matches any of the following date specifications:

 Tue, 03 May 2011 13:25:26 +0200
 Tue 03 May 2011 13:25:26 +0200
 03 May 2011 13:25:26 +0200
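
To check a format like this one against sample inputs, a small loop over the samples is enough. The following is a sketch under stated assumptions: count_accepted, nested_format and samples are hypothetical names, and the only library call used is mu_scan_datetime with the prototype shown at the top of this page.

```c
#include <stdio.h>
#include <time.h>
#include <mailutils/datetime.h>

/* Format with nested optional groups, copied from the text above. */
static const char nested_format[] = "%[%a%[,%] %]%d %b %Y %H:%M:%S %z";

/* Sample inputs, copied from the text above. */
static const char *samples[] = {
  "Tue, 03 May 2011 13:25:26 +0200",
  "Tue 03 May 2011 13:25:26 +0200",
  "03 May 2011 13:25:26 +0200",
};

/* Hypothetical helper: scan every sample, print the status of each
   scan, and return how many samples mu_scan_datetime accepted. */
int
count_accepted (void)
{
  size_t i;
  int accepted = 0;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      struct tm tm;
      struct mu_timezone tz;
      char *end = NULL;
      int rc = mu_scan_datetime (samples[i], nested_format, &tm, &tz, &end);

      printf ("%-34s -> status %d\n", samples[i], rc);
      if (rc == 0)
        accepted++;
    }
  return accepted;
}
```

The same loop can be reused with any other format from this page by changing nested_format and the samples table.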

%( %| %)
Alternatives. This specifier requires that the input match one of the alternatives separated by %|. At least two alternatives must be present.

For example, the following format

 "%a%(,%|:%|/%) %d %b %Y %H:%M:%S %z"

requires the day of week (%a) to be followed by a single comma, colon or slash. Any other character following it will cause the match to fail.

Optional blocks and alternative specifiers can be nested.

Notes

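  1. This change was introduced by commit df29254df8 (http://git.savannah.gnu.org/cgit/mailutils.git/commit/?id=df29254df82d4aa466f066dd76e00446e540e1cb). Prior to this, endp always received the address of the character in input where the conversion stopped.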
  2. http://git.gnu.org.ua/cgit/mailutils.git/tree/include/mailutils/datetime.h

See also

mu_c_streamftime