Since this topic is being discussed here, here is my own contribution to the board:
If you have used sed, awk, grep, or vi under Linux, the concept of a regular expression will be familiar. Because regular expressions greatly reduce the complexity of string handling, they are now used in many Linux utilities. Do not assume that they belong only to scripting languages such as Perl, Python, and Bash: as a C programmer, you can use regular expressions in your own programs too.
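The C standard library has no regular expression support, and C++ gained std::regex only with C++11, but several libraries fill the gap for C/C++ programmers. The best known is Philip Hazel's Perl-Compatible Regular Expressions library (PCRE), which is included in many Linux distributions. The functions described below are the POSIX regular expression interface declared in <regex.h>, which the C library on Linux provides.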
Compile a regular expression
For efficiency, a regular expression should be compiled with regcomp() into a regex_t structure before any string is compared against it:
int regcomp(regex_t *preg, const char *regex, int cflags);
The parameter regex is a string holding the regular expression to compile. The parameter preg points to a regex_t structure that receives the compiled result. The parameter cflags controls how the expression is interpreted; it is the bitwise OR of zero or more of REG_EXTENDED, REG_ICASE, REG_NOSUB, and REG_NEWLINE.
If regcomp() succeeds, it stores the compiled result in preg and returns 0. Any other return value indicates an error.
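As a rough illustration of how cflags is typically built up, the sketch below wraps regcomp() in a small helper. The helper name compile_pattern and its ignore_case parameter are invented for this example; REG_EXTENDED and REG_ICASE are standard flag values from <regex.h>.

```c
#include <regex.h>

/* Hypothetical helper: compile `pattern` as a POSIX extended regular
 * expression, optionally case-insensitive. Returns 0 on success or the
 * non-zero regcomp() error code on failure. */
int compile_pattern(regex_t *preg, const char *pattern, int ignore_case)
{
    int cflags = REG_EXTENDED;      /* extended syntax: +, ?, |, ( ) */

    if (ignore_case)
        cflags |= REG_ICASE;        /* match without regard to case */

    return regcomp(preg, pattern, cflags);
}
```

A caller would test the return value against 0 before using preg, and call regfree() on it only after a successful compilation.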
Match a regular expression
Once regcomp() has compiled the regular expression, call regexec() to perform the pattern matching:
int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);
typedef struct {
    regoff_t rm_so;   /* offset of the first byte of the match */
    regoff_t rm_eo;   /* offset of the first byte after the match */
} regmatch_t;
The parameter preg points to the compiled regular expression, and string is the string to match against. The parameters nmatch and pmatch return the match positions to the caller, and the last parameter, eflags, controls details of the matching (REG_NOTBOL and REG_NOTEOL).
A single call to regexec() finds at most one match of the whole expression, but that match may contain several parenthesized subexpressions, and the pmatch array records the position of each. The parameter nmatch tells regexec() how many entries of pmatch it may fill. When regexec() returns successfully, the bytes from string + pmatch[0].rm_so up to, but not including, string + pmatch[0].rm_eo are the substring matched by the entire expression; string + pmatch[1].rm_so to string + pmatch[1].rm_eo is the substring matched by the first parenthesized subexpression, and so on. Entries that are not used have rm_so and rm_eo set to -1.
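To make the role of the pmatch entries concrete, here is a minimal sketch that splits a "name=number" setting into its two parts. The function names copy_group and split_setting, the pattern, and the buffer handling are assumptions made for illustration only.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copy the text of match slot `idx` into `out`.
 * Returns 0 on success, -1 if the slot is unused or `out` is too small. */
int copy_group(const char *string, const regmatch_t *pmatch, size_t idx,
               char *out, size_t out_size)
{
    if (pmatch[idx].rm_so == -1)
        return -1;

    size_t len = (size_t)(pmatch[idx].rm_eo - pmatch[idx].rm_so);
    if (len >= out_size)
        return -1;

    memcpy(out, string + pmatch[idx].rm_so, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical example: find "name=number" in `line` and copy both parts.
 * Returns 0 on success, non-zero if compiling or matching fails. */
int split_setting(const char *line, char *name, size_t name_size,
                  char *value, size_t value_size)
{
    regex_t re;
    regmatch_t pm[3];   /* pm[0]: whole match, pm[1] and pm[2]: the groups */
    int rc;

    rc = regcomp(&re, "([a-z]+)=([0-9]+)", REG_EXTENDED);
    if (rc != 0)
        return rc;

    rc = regexec(&re, line, 3, pm, 0);
    if (rc == 0) {
        if (copy_group(line, pm, 1, name, name_size) != 0 ||
            copy_group(line, pm, 2, value, value_size) != 0)
            rc = -1;
    }
    regfree(&re);
    return rc;
}
```

The parentheses only act as grouping operators here because the pattern is compiled with REG_EXTENDED; in basic syntax the groups would be written with backslashes.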
Release a regular expression
Whenever a compiled regular expression is no longer needed, call regfree() to release it and avoid a memory leak.
void regfree(regex_t *preg);
regfree() returns nothing. Its only argument is a pointer to the regex_t that an earlier call to regcomp() filled in.
If a program calls regcomp() several times on the same regex_t structure, the POSIX standard does not say whether regfree() must be called each time.
It is nevertheless advisable to call regfree() once for every successful regcomp(), so that the storage is released as early as possible.
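The pairing of regcomp() and regfree() on a reused regex_t might look like the following sketch. The function name count_matching_patterns is hypothetical, and skipping patterns that fail to compile is simply a choice made for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count how many of the `n` patterns match `text`,
 * reusing a single regex_t. Patterns that fail to compile are skipped. */
int count_matching_patterns(const char *const patterns[], size_t n,
                            const char *text)
{
    regex_t re;
    int hits = 0;

    for (size_t i = 0; i < n; ++i) {
        /* REG_NOSUB: only a yes/no answer is needed, no match positions */
        if (regcomp(&re, patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
            continue;   /* contents of re are undefined here: do not free */

        if (regexec(&re, text, 0, NULL, 0) == 0)
            ++hits;

        regfree(&re);   /* release before the next regcomp() reuses re */
    }
    return hits;
}
```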
Report an error message
A non-zero return value from regcomp() or regexec() means that an error occurred while processing the regular expression. In that case, regerror() provides a detailed error message.
size_t regerror(int errcode, const regex_t *preg, char *errbuf, size_t errbuf_size);
The parameter errcode is the error code returned by regcomp() or regexec(), and preg is the regex_t that was passed to regcomp(); it gives regerror() the context it needs to format the message. regerror() writes the formatted message into errbuf, truncating it if necessary to fit the errbuf_size bytes available, and returns the number of bytes needed to hold the complete message, including the terminating null byte.
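Because regerror() reports the size it needs, a buffer of exactly the right length can be obtained by calling it twice. The sketch below shows one way to do this; the function name regex_error_string is made up for the example, and the caller is assumed to free the returned string.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: return a heap-allocated copy of the message for
 * `errcode`, or NULL if allocation fails. The caller frees the result. */
char *regex_error_string(int errcode, const regex_t *preg)
{
    /* With errbuf_size 0, regerror() ignores the buffer and only reports
     * how many bytes the full message needs, terminator included. */
    size_t needed = regerror(errcode, preg, NULL, 0);
    char *msg = malloc(needed);

    if (msg != NULL)
        regerror(errcode, preg, msg, needed);
    return msg;
}
```

A fixed buffer, as used in the example program in this article, is simpler and is usually enough, since the messages are short.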
Apply regular expressions
Finally, a concrete example shows how to work with regular expressions in a C program.
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <regex.h>

/* Return the substring str[start, end) in a static buffer */
static char *substr(const char *str, unsigned start, unsigned end)
{
    static char stbuf[256];
    unsigned n = end - start;

    if (n >= sizeof(stbuf))     /* never write past the buffer */
        n = sizeof(stbuf) - 1;
    strncpy(stbuf, str + start, n);
    stbuf[n] = 0;
    return stbuf;
}

/* Main program */
int main(int argc, char **argv)
{
    char *pattern;
    int z, lno = 0, cflags = 0;
    size_t x, len;
    char ebuf[128], lbuf[256];
    regex_t reg;
    regmatch_t pm[10];
    const size_t nmatch = 10;

    if (argc < 2) {
        fprintf(stderr, "usage: %s pattern\n", argv[0]);
        return 1;
    }

    /* Compile the pattern given on the command line */
    pattern = argv[1];
    z = regcomp(&reg, pattern, cflags);
    if (z != 0) {
        regerror(z, &reg, ebuf, sizeof(ebuf));
        fprintf(stderr, "%s: pattern '%s'\n", ebuf, pattern);
        return 1;
    }

    /* Process the input line by line */
    while (fgets(lbuf, sizeof(lbuf), stdin)) {
        ++lno;
        len = strlen(lbuf);
        if (len > 0 && lbuf[len - 1] == '\n')
            lbuf[len - 1] = 0;      /* strip the trailing newline */

        /* Match the compiled pattern against the line */
        z = regexec(&reg, lbuf, nmatch, pm, 0);
        if (z == REG_NOMATCH)
            continue;
        else if (z != 0) {
            regerror(z, &reg, ebuf, sizeof(ebuf));
            fprintf(stderr, "%s: regcom('%s')\n", ebuf, lbuf);
            return 2;
        }

        /* Print the line once, then every match slot that was filled */
        for (x = 0; x < nmatch && pm[x].rm_so != -1; ++x) {
            if (!x)
                printf("%04d: %s\n", lno, lbuf);
            printf("  $%zu='%s'\n", x, substr(lbuf, pm[x].rm_so, pm[x].rm_eo));
        }
    }

    /* Release the compiled pattern */
    regfree(&reg);
    return 0;
}
The program takes a regular expression from the command line, applies it to each line read from standard input, and prints the matches. Compile and run it with the following commands (here the program reads its own source file, so the four-digit line numbers in the output depend on how regexp.c is laid out):
# gcc regexp.c -o regexp
# ./regexp 'regex[a-z]*' < regexp.c
0003: #include <regex.h>
  $0='regex'
0027: regex_t reg;
  $0='regex'
0054: z = regexec(&reg, lbuf, nmatch, pm, 0);
  $0='regexec'
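The program above reports only the first match on each line. If every match is wanted, one possible approach is to call regexec() repeatedly, each time starting just past the previous match. The sketch below counts matches this way; the function name count_matches is invented for the example, and it assumes the expression was compiled without REG_NOSUB so that match positions are available.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count the non-overlapping matches of the compiled
 * expression `preg` in `line`. */
size_t count_matches(const regex_t *preg, const char *line)
{
    const char *p = line;
    regmatch_t pm;
    size_t count = 0;
    int eflags = 0;

    while (regexec(preg, p, 1, &pm, eflags) == 0) {
        ++count;
        if (pm.rm_eo > 0)
            p += pm.rm_eo;      /* resume right after this match */
        else if (*p != '\0')
            ++p;                /* empty match at p: step one byte */
        else
            break;              /* empty match at end of string */
        eflags = REG_NOTBOL;    /* p is no longer the start of the line */
    }
    return count;
}
```

REG_NOTBOL keeps a leading ^ in the pattern from matching at each restart position, since those positions are not the real beginning of the line.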
Summary
Regular expressions are a very useful tool for programs that need complex data processing. This article has focused on how to use regular expressions in C to simplify string handling, giving C programs flexibility in data processing similar to that of Perl.