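Lex is short for Lexical Analyzer Generator (the name takes the first three letters of "Lexical"). It is a well-known tool in the Unix environment whose main job is to generate the C source code of a lexical analyzer (scanner) from rules written as regular expressions.

Lex has been widely used to describe lexical analyzers for many languages.

flex stands for "fast lexical analyzer generator"; it is a faster reimplementation of lex.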

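Win-flex bison is a port of flex (the fast lexical analyzer generator) and bison (the GNU parser generator) to the Windows platform.

Win-flex bison can be downloaded from: https://sourceforge.net/projects/winflexbison/

Click the "Download" button to start downloading the file "win_flex_bison-latest.zip", which is only 692 KB.

Extract it to a location of your choice.

You can use win_flex and win_bison directly from the command line, or use them in Visual Studio through CustomBuildRules (see https://sourceforge.net/p/winflexbison/wiki/Visual%20Studio%20custom%20build%20rules/ ).

Example flex/bison files can be found at https://sourceforge.net/projects/winflexbison/files/

Running win_flex --help on the command line prints the usage: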

Usage: win_flex [OPTIONS] [FILE]...

Generates programs that perform pattern-matching on text.

Table Compression:

-Ca, --align trade off larger tables for better memory alignment

-Ce, --ecs construct equivalence classes

-Cf do not compress tables; use -f representation

-CF do not compress tables; use -F representation

-Cm, --meta-ecs construct meta-equivalence classes

-Cr, --read use read() instead of stdio for scanner input

-f, --full generate fast, large scanner. Same as -Cfr

-F, --fast use alternate table representation. Same as -CFr

-Cem default compression (same as --ecs --meta-ecs)

Debugging:

-d, --debug enable debug mode in scanner

-b, --backup write backing-up information to lex.backup

-p, --perf-report write performance report to stderr

-s, --nodefault suppress default rule to ECHO unmatched text

-T, --trace win_flex should run in trace mode

-w, --nowarn do not generate warnings

-v, --verbose write summary of scanner statistics to stdout

--hex use hexadecimal numbers instead of octal in debug outputs

Files:

-o, --outfile=FILE specify output filename

-S, --skel=FILE specify skeleton file

-t, --stdout write scanner on stdout instead of lex.yy.c

--yyclass=NAME name of C++ class

--header-file=FILE create a C header file in addition to the scanner

--tables-file[=FILE] write tables to FILE

Scanner behavior:

-7, --7bit generate 7-bit scanner

-8, --8bit generate 8-bit scanner

-B, --batch generate batch scanner (opposite of -I)

-i, --case-insensitive ignore case in patterns

-l, --lex-compat maximal compatibility with original lex

-X, --posix-compat maximal compatibility with POSIX lex

-I, --interactive generate interactive scanner (opposite of -B)

--yylineno track line count in yylineno

Generated code:

-+, --c++ generate C++ scanner class

-Dmacro[=defn] #define macro defn (default defn is '1')

-L, --noline suppress #line directives in scanner

-P, --prefix=STRING use STRING as prefix instead of "yy"

-R, --reentrant generate a reentrant C scanner

--bison-bridge scanner for bison pure parser.

--bison-locations include yylloc support.

--stdinit initialize yyin/yyout to stdin/stdout

--nounistd do not include <unistd.h>

--wincompat windows compatibility (uses <io.h> instead of <unistd.h> and _isatty, _fileno functions)

--noFUNCTION do not generate a particular FUNCTION

Miscellaneous:

-c do-nothing POSIX option

-n do-nothing POSIX option

-h, --help produce this help message

-V, --version report win_flex version

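Example 1

This example follows the one at https://www.cnblogs.com/zhuyingchun/p/9129366.html .

(1) Create the text file "a.l" (that is, write the lex program)

Use a text editor to create the file "d:/temp/a.l" with the following content: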

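%{
/* Counters updated by the rules below. */
int num_lines=0, num_chars=0;
%}
%%
\n      ++num_lines; ++num_chars;
.       ++num_chars;
%%
int main()
{
    /* The scanner reads its own source file as input. */
    yyin=fopen("d:/temp/a.l","r");
    if (yyin == NULL) {
        perror("d:/temp/a.l");
        return 1;
    }
    yylex();
    fclose(yyin);
    printf("lines=%d, chars=%d\n", num_lines, num_chars);
    return 0;
}

int yywrap()
{
    return 1;
}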

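The double percent sign "%%" is a delimiter reserved by lex. It separates the three parts of a lex program: the declarations, the translation rules (each rule consists of a pattern and an action; the pattern is a regular expression and the action is program code), and the auxiliary procedures (functions written in C).

(2) Compile the file "a.l" with win_flex

Enter the following command in a command-line window (the --wincompat option tells the lex compiler to generate Windows-compatible code):

D:\Programs\win_flex_bison-latest\win_flex.exe --wincompat --outfile=d:/temp/a.yy.c d:/temp/a.l

When the command succeeds it generates the file "d:/temp/a.yy.c", which is fairly large.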

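(3) Compile a.yy.c with a C compiler

The C compiler I use is the one in Visual Studio 2022. Open the VS2022 developer command prompt, change to the directory "d:\temp" and run the following compile command: cl a.yy.c

When the command succeeds, the files "a.yy.exe" and "a.yy.obj" are generated in the directory.

For how to open the VS2022 developer command prompt, see the relevant part of https://www.toutiao.com/article/7063452501693481511/?log_from=394c061792cc1_1660021719133 .

(4) Run the program file "a.yy.exe"

Run the command "a.yy" in the command-line window.

[Figure: screenshot of the program output]

The program counts the lines and characters in the file.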

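Example 2

This example follows the one at https://blog.csdn.net/grandpa_pit/article/details/117195547 .

(1) Create the text file "b.l" (that is, write the lex program)

Use a text editor to create the file "d:/temp/b.l" with the following content: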

%{
#include <stdio.h>
#include <stdlib.h>
int count=0;
%}
/* NOTE: the definition of reservedWord is missing from this listing; the keyword list below is an assumed placeholder. */
reservedWord include|using|namespace|int|return
delim [" "\n\t\r]
whitespace {delim}+
operator \+|-|\*|\/|:=|>=|<=|#|=|<<|>>|\+\+
delimiter [,\.;\(\)\"\<\>\{\}]
constant ([0-9])+
identifier [A-Za-z]([A-Za-z]|[0-9])*
%%
{reservedWord} {count++;printf("%d\t(rw,%s)\n",count,yytext);}
\"[^\"]*\" {count++;printf("%d\t(ct,%s)\n",count,yytext);}
{operator} { count++;printf("%d\t(op,%s)\n",count,yytext); }
{delimiter} {count++;printf("%d\t(de,%s)\n",count,yytext);}
{constant} {count++;printf("%d\t(ct,%s)\n",count,yytext);}
{identifier} {count++;printf("%d\t(id,%s)\n",count,yytext);}
{whitespace} { /* do nothing*/ }
%%
int main()
{
    yyin=fopen("d:/temp/input.txt","r");
    if (yyin == NULL) {
        perror("d:/temp/input.txt");
        return 1;
    }
    yylex();
    fclose(yyin);
    return 0;
}

int yywrap()
{
    return 1;
}

The file "d:/temp/input.txt" used by the program above contains:

#include<iostream>

using namespace std;

int main(){

cout<<"Hello World!"<<a + b=i++;

}

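(2) Compile the file "b.l" with win_flex

Enter the following command in a command-line window (the --wincompat option tells the lex compiler to generate Windows-compatible code):

D:\Programs\win_flex_bison-latest\win_flex.exe --wincompat --outfile=d:/temp/b.yy.c d:/temp/b.l

When the command succeeds it generates the file "d:/temp/b.yy.c", which is fairly large.

(3) Compile b.yy.c with a C compiler

The C compiler I use is the one in Visual Studio 2022. Open the VS2022 developer command prompt, change to the directory "d:\temp" and run the following compile command: cl b.yy.c

When the command succeeds, the files "b.yy.exe" and "b.yy.obj" are generated in the directory.

(4) Run the program file "b.yy.exe"

Run the command "b.yy" in the command-line window.

[Figure: screenshot of the program output]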

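One problem remains: if a space is added in front of the opening double quote of "Hello World!" in "d:/temp/input.txt", the output changes and no longer matches expectations. The most likely cause is the definition delim [" "\n\t\r]. Inside a character class the double quotes are ordinary characters, so delim matches the quote character as well as the space. Because flex always takes the longest match, the whitespace rule then consumes the space together with the opening quote, and the string rule never gets to see it. Writing the class as [ \n\t\r] should avoid this.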

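Summary

As the two examples above show, the steps for using Lex on Windows are:

(1) Write the lex program, that is, a text file with the extension ".l".

(2) Compile the lex program, that is, process it with win_flex.exe to obtain a file with the extension ".c".

(3) Build the executable, that is, use a C compiler to produce a file with the extension ".exe".

(4) Run the executable.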

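Origins

bison descends from yacc, a parser generator written by Stephen C. Johnson at Bell Labs between 1975 and 1978. As its name suggests (yacc is short for "yet another compiler compiler"), many people were writing parser generators at the time. Johnson's tool combined the parsing theory developed by D. E. Knuth (which is why yacc is so reliable) with a convenient input syntax. This made yacc very popular among Unix users, even though the restrictive license Unix was under at the time limited its use to academia and the Bell System. Around 1985, Bob Corbett, a graduate student at the University of California, Berkeley, reimplemented yacc with improved internal algorithms; the result became Berkeley yacc. Because it was faster than the Bell Labs yacc and was released under the flexible Berkeley license, it quickly became the most popular yacc. Richard Stallman of the Free Software Foundation (FSF) adapted Corbett's version for the GNU project, where it gained a large number of new features and evolved into today's bison. bison is now maintained as an FSF project and is distributed under the GNU General Public License.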

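In 1975, Mike Lesk and a summer intern, Eric Schmidt, wrote lex, a lexical analyzer generator; most of the programming was done by Schmidt. They found that lex could be used both as a standalone tool and as a companion to Johnson's yacc. lex became very popular as a result, even though it ran somewhat slowly and had many bugs. (Schmidt went on to a very successful career in the computer industry: as of 2009 he was the CEO of Google; he handed over the CEO role in 2011 and stayed on as Google's chairman.)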

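Around 1987, Vern Paxson of Lawrence Berkeley Laboratory took a version of lex written in ratfor (an extended Fortran that was popular at the time) and translated it into C. He called it flex, for "Fast Lexical Analyzer Generator". Because it was faster and more reliable than AT&T's lex and, like Berkeley yacc, was available under the Berkeley license, it eventually displaced the original lex. flex is now a SourceForge project and is still under the Berkeley license.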

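Installation

Most Linux and BSD systems come with flex and bison as part of the base system. If your system does not include them, they are easy to install.

On Ubuntu/Debian, for example, they can be installed directly with apt: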

# Ubuntu 20
$ sudo apt install flex bison -y

$ flex -V
flex 2.6.4
$ bison -V
bison (GNU Bison) 3.5.1

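Examples

The examples are at https://github.com/ikuokuo/start-ai-compiler/tree/main/books/flex_bison ; all of them come from the book Flex & Bison listed in the Conclusion.

The examples show how to use Flex & Bison to develop a calculator that supports variables, procedures, loops and conditional expressions, has built-in functions, and also supports user-defined functions.

Build all the examples as follows: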

cd books/flex_bison/

# build release
make
# build debug
make debug

# clean
make clean

The example programs are written to the _build directory and can be run as follows:

$ ./_build/linux-x86_64/release/1-5_calc/bin/1-5_calc
> (1+2)*3 + 4/2=11

$ ./_build/linux-x86_64/release/3-5_calc/bin/3-5_calc
> let sq(n)=e=1; while |((t=n/e)-e)>.001 do e=avg(e,t);;
Defined sq
> let avg(a,b)=(a+b)/2;
Defined avg
> sq(10)=3.162
> sqrt(10)=3.162
> sq(10)-sqrt(10)=0.000178

To build only a single example:

cd ch01/1-1_wc/

# build release
make -j8
# build debug
make -j8 args="debug"

# clean
make clean

Programs

Flex and Bison programs both consist of three parts: the definitions section, the rules section and the user subroutines section.

... definition section ...
%%
... rules section ...
%%
... user subroutines section ...

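The rules section of Flex is based on regular expressions, while that of Bison is based on a BNF (Backus-Naur Form) grammar. For detailed usage, follow the book Flex & Bison listed in the Conclusion, together with the examples.

This article does not go into further detail; its aim is to make readers aware that tools such as Flex and Bison exist and what kind of work they can help with.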

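Conclusion

Flex and Bison are tools that automatically generate lexical analyzers (scanners) and parsers, applying results from formal language theory. The same tools can also be used for text search, website filtering, word processing and command-line language interpreters.

This article draws mainly on the following books:

2011-03 / flex與bison (Chinese edition)[4] / read online[5]
2009 / flex & bison - Text Processing Tools[6] / read online[7]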

Footnotes

[1] sql/sql_yacc.yy: https://github.com/mysql/mysql-server/blob/8.0/sql/sql_yacc.yy

[2] parser/scan.l: https://github.com/postgres/postgres/blob/master/src/backend/parser/scan.l

[3] parser/gram.y: https://github.com/postgres/postgres/blob/master/src/backend/parser/gram.y

[4] 2011-03 / flex與bison (Chinese edition): https://book.douban.com/subject/6109479/

[5] Read online: http://home.ustc.edu.cn/~guoxing/ebooks/flex%E4%B8%8Ebison%E4%B8%AD%E6%96%87%E7%89%88.pdf

[6] 2009 / flex & bison - Text Processing Tools: https://book.douban.com/subject/3568327/

[7] Read online: https://web.iitd.ac.in/~sumeet/flex__bison.pdf
