
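    Reposted by Big Data Digest (大数据文摘) with permission from Andy's Writing Room (安迪的写作间)

    Author: Andy

    This morning, right after getting up, I saw a tweet from François Chollet (the author of Keras). Inspired by the very long-range memory of the medium-sized GPT-2 model, he had come up with a simple text generation method that involves no machine learning at all, and it somehow reproduced GPT-2's results. The method is simple (the code took only 20 minutes to write): at each step, take the keywords of the text plus the last few words of the sentence, search for them directly on Google, and stitch the retrieved snippets together on those last few words. Keep doing this and you can even generate the example from the GPT-2 paper about the discovery of magical unicorns.

    As for the code, François joked: "I will not be releasing the code, because you guys couldn't handle the power of a Python script cobbled together in 20 minutes with Requests, BeautifulSoup, and regular expressions. It would change algorithmic cyberwar forever." Yet another "Too dangerous to release." It was mostly teasing, of course, and a sharp dig at OpenAI. I am taking the opportunity to publish the draft I have been slowly writing over the past few days.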


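    If BERT was clever in proposing the Masked Language Model loss plus the Next Sentence Prediction loss so that the pretrained model learns more complete information, the GPT series simply took the seemingly unremarkable Transformer decoder (unidirectional decoding) and made it bigger and bigger. Good data is indispensable too, of course. It then showed everyone that once the model is big enough (GPT-2), it becomes very powerful, especially at language generation, which happens to fill the gap left by BERT.

    GPT-2 is probably this famous now because OpenAI learned its lesson from last time: most people only found out that GPT existed after BERT appeared, so GPT-1 ended up as a perfect stepping stone for BERT. When GPT-2 came on stage, the body of the paper was only a few pages long, yet it stole the show. I don't know how much of that was OpenAI's PR at work. When people asked whether it would be open-sourced, the answer was: "Too Dangerous to Release!" (In other words: you don't get to use it!)

    That statement set off an immediate wave of reactions. Pro-OpenAI and anti-OpenAI camps formed at once, both argued their case at length and published post after post, and during those days I read at least one blog post about the GPT-2 controversy every day. Amid all the noise, the smallest pretrained GPT-2 model, 117M (the number refers to the parameter count), was quietly released.

    After that I occasionally saw posts on Reddit about finetuning the 117M model. They were fun, and I kept meaning to find the time to play with it myself, but I was too busy. I was focused on BERT for a while and then went on holiday, so it got shelved.

    Then, a few days ago, I noticed that posts about GPT-2 finetuning had suddenly multiplied again, and found out that OpenAI had released a larger model: the 345M model that this article mainly uses (to use the small model instead, just replace 345M with 117M throughout). Besides these two, according to the paper, there should be two even larger models. If OpenAI plans to release them, the GPT-2 hype could last through all of 2019.


    Riding the current wave, I finally went through the libraries related to using GPT-2 and finetuned a few models of my own. The results turned out to be pretty good. I also found that there is not much Chinese-language material online about using GPT-2, so I am sharing my experience here.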

    The article is organized as follows; take what you need:

    • First, I show how to finetune a model with the lower-level gpt-2 library by nshepperd;
    • Next, I show how to finetune more easily with the higher-level gpt-2-simple library by minimaxir, mainly through a Colab Notebook, so you can borrow a free GPU for finetuning;
    • Finally, I show how to deploy the trained model to a server with t04glovern's gpt-2-flask-api, so that you can open it in a browser and type in a sentence for the model to continue. This part also uses Hugging Face's pytorch-pretrained-BERT to convert the model format.

    GitHub links for the required libraries:

    • gpt-2: https://github.com/nshepperd/gpt-2
    • gpt-2-simple: https://github.com/minimaxir/gpt-2-simple
    • gpt-2-flask-api: https://github.com/t04glovern/gpt-2-flask-api
    • pytorch-pretrained-BERT: https://github.com/huggingface/pytorch-pretrained-BERT

    The training data is the script of all ten seasons of Friends, which I scraped from the web.


    Let's get started. I assume you are working on Linux.


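    Boss, One Plate of GPT-2 to Start


    The whole process has roughly four steps: clone nshepperd's gpt-2 library, prepare the data and the model, finetune, and finally generate samples with the saved model.

    git clone https://github.com/nshepperd/gpt-2
    cd gpt-2
    pip install -r requirements.txt  # install the required packages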
    


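    From inside the repository folder, download the pretrained model you need. Here I use the newly released medium model; if your machine is not powerful enough, use the 117M model.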

    python download_model.py 345M
    


    The 345M model is fairly large, about 1.4 GB, so you can prepare the data while it downloads. If you use the data I provide, just copy it into data/. Take a quick look at what the data looks like.


    Then you can start finetuning. To make finetuning faster, you can pre-encode the data into the training format.

    PYTHONPATH=src ./encode.py data/friends.txt data/friends.txt.npz


    Time to finetune!

    PYTHONPATH=src ./train.py --dataset data/friends.txt.npz --model_name 345M
    


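    Other parameters worth knowing:

    • learning_rate: the learning rate, 2e-5 by default. Adjust it to the dataset size: a bit higher for a large dataset, a bit lower for a small one.
    • sample_every: how many steps between generated samples, so you can check progress. Default 100.
    • run_name: the name of the current training run. A subfolder with this name is created under both samples and checkpoint, and the generated samples and saved models go into those two subfolders. To resume an interrupted run, use the same run_name; to run a different task, use a different run_name.

    Speed depends on your machine, but you can usually see fairly decent results after two or three thousand steps.


    Now we have a finetuned model, so let's move on to the fun part: generation. The first step is to rename the trained model files and put them in the models folder, replacing the original model (be sure to back up the original model first!).

    For example, rename all the model-4000 files in checkpoint/run1 to model.ckpt and move them into models/345M.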
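If you would rather not rename the files by hand, the step above can be sketched in Python. This is only a rough sketch: `install_checkpoint` is a hypothetical helper of mine, not part of nshepperd's library, and it assumes the checkpoint consists of files named `model-<step>.<suffix>` (for example `.index` and `.meta`). It moves any file it would overwrite into a `backup` subfolder first.

```python
import shutil
from pathlib import Path


def install_checkpoint(run_dir, step, model_dir):
    """Copy model-<step>.* from run_dir into model_dir as model.ckpt.*.

    Hypothetical helper: any file that would be overwritten is moved
    to model_dir/backup first, so the original model is not lost.
    """
    run_dir, model_dir = Path(run_dir), Path(model_dir)
    backup_dir = model_dir / "backup"
    stem = f"model-{step}"
    copied = []
    for src in sorted(run_dir.glob(stem + ".*")):
        suffix = src.name[len(stem):]          # e.g. ".index", ".meta"
        dst = model_dir / ("model.ckpt" + suffix)
        if dst.exists():
            backup_dir.mkdir(exist_ok=True)
            shutil.move(str(dst), str(backup_dir / dst.name))
        shutil.copy2(src, dst)
        copied.append(dst.name)
    return copied


# Example call (paths follow the article's layout):
# install_checkpoint("checkpoint/run1", 4000, "models/345M")
```

Copying rather than moving keeps the run folder intact, so training can still be resumed from it later.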

    OK. First comes the freestyle round: use generate_unconditional_samples.py to generate samples unconditionally.

    python src/generate_unconditional_samples.py --top_k 40 --temperature 0.9 --model_name 345M
    


    Next comes the essay-on-a-topic round: interactive, conditional generation.

    python src/interactive_conditional_samples.py --top_k 40 --temperature 0.9 --model_name 345M
    


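    Once it is running, an interactive prompt appears. Type in the text you want the model to continue. Let me think...


    And now, the moment of truth... ... ... After quite a while: ta-da!


    Two seconds after "Rachel loves Andy", it goes completely off topic. Sad, but the second half still feels quite interesting.

    The --top_k and --temperature parameters affect the quality of the generated text. Try tuning them yourself; the example above uses two recommended settings.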
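To get a feel for what these two knobs do, here is a minimal pure-Python sketch of top-k sampling with temperature. It is my own illustration of the general technique, not the code used by the gpt-2 scripts; `sample_top_k` and its arguments are names I made up for the example.

```python
import math
import random


def sample_top_k(logits, top_k=40, temperature=0.9, rng=random):
    """Pick one index from `logits` using top-k filtering and temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Rank candidates by score and keep the k best (top_k=0 keeps all of them).
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k] if top_k > 0 else ranked
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)                      # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(kept, weights=weights, k=1)[0]


# With top_k=1 the choice is always the highest-scoring token.
print(sample_top_k([0.1, 2.0, -1.0], top_k=1))
```

A smaller top_k or a lower temperature makes the output more conservative; larger values make it more varied.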

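    That completes the basic process of finetuning GPT-2. Much simpler than you expected, isn't it?

    But there is an even simpler way below.

    Simpler Still: gpt-2-simple


    As the name suggests, the gpt-2-simple library makes finetuning and generation simpler. It is mostly built on top of the gpt-2 code above.

    For the key steps, I reproduce part of the Colab Notebook here; see the Notebook for more detail. I recommend following the tutorial in the Notebook, since you get a free GPU to use.

    Notebook link: https://colab.research.google.com/drive/1_kQQ8WCjus9mz0Cf1onVeE1pUG-ulTqA


    The overall process is much the same as above, but the commands are simpler. Again, start by downloading the model.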

    import gpt_2_simple as gpt2
    gpt2.download_gpt2(model_name="345M")
    


    Then put the training data in place and start training.

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess,
                  dataset="friends.txt",
                  model_name='345M',
                  steps=1000,
                  restore_from='fresh',
                  print_every=10,
                  sample_every=200,
                  save_every=500
                  )
    


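    Very straightforward: just call gpt2.finetune.

    The training parameters of gpt2.finetune:

    • restore_from: fresh starts from the original GPT-2 model, while latest continues training from the previously saved finetuned model
    • sample_every: how many steps between printed samples, to check training progress
    • print_every: how many steps between printed training stats; from left to right: step, time, loss, average loss
    • learning_rate: the learning rate (default 1e-4; if the data is smaller than 1 MB, you can lower it to 1e-5)
    • run_name: the subfolder under checkpoint where the model is saved during the run; default run1

    You will notice that much of this resembles the previous section.

    Once training has produced a saved model, it is time for generation again. First load the model.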

    sess=gpt2.start_tf_sess()
    gpt2.load_gpt2(sess)
    


    Then generate text.

    gpt2.generate(sess)
    


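    gpt2.generate also has many parameters you can set:

    • length: length of the generated text (default 1023, which is also the maximum)
    • temperature: the higher the temperature, the more random the output (default 0.7; recommended between 0.7 and 1.0)
    • top_k: restrict the output to the top k candidates (default 0, meaning disabled; recommended when the output quality is poor, for example top_k=40)
    • truncate: cut the generated text at the given token (for example, with truncate='<|endoftext|>', the text before the first '<|endoftext|>' is returned as the output). Works well combined with a smaller length value.
    • include_prefix: if you use truncate together with include_prefix=False, the returned text does not contain the text from prefix.


    To generate text in bulk, use gpt2.generate_to_file.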
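The interplay of truncate and include_prefix is easier to see on a plain string. The sketch below only mimics the behaviour described in the list above; `postprocess` is a hypothetical function of mine and is not taken from gpt-2-simple.

```python
def postprocess(generated, prefix="", truncate=None, include_prefix=True):
    """Apply the described truncate / include_prefix rules to a generated string."""
    text = generated
    if truncate:
        # Without the prefix: drop it first so only newly generated text remains.
        if not include_prefix and prefix and text.startswith(prefix):
            text = text[len(prefix):]
        # Keep only what comes before the first occurrence of the marker.
        cut = text.find(truncate)
        if cut != -1:
            text = text[:cut]
    return text


raw = "Ross: Hi there.<|endoftext|>Joey: How you doin'?"
print(postprocess(raw, truncate="<|endoftext|>"))
print(postprocess(raw, prefix="Ross:", truncate="<|endoftext|>", include_prefix=False))
```

Pairing this with a small length value avoids generating a long tail of text that is thrown away afterwards.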


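    Deploying to a Server


    Now that the model is ready, it is of course time to show it off: deploy it to a server so that your friends can generate text interactively straight from a browser.

    This mainly uses the gpt-2-flask-api library on GitHub. All it needs is a pretrained or finetuned GPT-2 model (in Hugging Face's PyTorch format).


    Put the model file under models/ and name it gpt2-pytorch_model.bin. You can also run a first experiment with the example model it provides: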

    mkdir models
    curl --output models/gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin
    


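    Then run python deployment/run_server.py.


    You will then get an address and port to access:


    After that, just open it in a browser. For remote access, replace 0.0.0.0 with the server's IP.


    Now type in the text you want it to continue, wait a moment, and the result appears. Black is the user's input; red is what the model generated.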


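    One Last Question: How to Deploy Your Own Model


    Finetuning saves the model in TensorFlow's file format, but this package only supports models saved by PyTorch. So we first have to convert the TensorFlow model into a PyTorch model.

    For this you can use the conversion script in Hugging Face's pytorch-pretrained-BERT library. Install the library following its instructions, then run the script below.

    export GPT2_DIR=/path/to/model_folder  # the folder that contains the model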

    pytorch_pretrained_bert convert_gpt2_checkpoint $GPT2_DIR/model_name output_dir/ path_to_config/config.json
    


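    The three arguments after convert_gpt2_checkpoint in the command above are, in order: the path of the input TensorFlow model, the output path for the converted PyTorch model, and the model's configuration file.


    Note that because these libraries are not consistent with each other, the settings file of the downloaded 345M model causes an error during conversion, and a few parameters have to be added. If you downloaded the 345M model earlier, you will find a settings file named hparams.json in the model folder.

    cp hparams.json hparams_convert.json  # make a copy to edit

    Then add a few parameters to hparams_convert.json so that it looks like this: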


    { 
    "n_vocab": 50257,
    "n_ctx": 1024,
    "n_embd": 1024,
    "n_head": 16, 
    "n_layer": 24, 
    "vocab_size":50257,
    "n_positions":1024,
    "layer_norm_epsilon":1e-5,
    "initializer_range": 0.02
    }
    


    Pass this settings file as the corresponding argument of the convert_gpt2_checkpoint command.
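The same edit can be scripted. A small sketch, assuming (as in the JSON above) that vocab_size simply mirrors n_vocab and n_positions mirrors n_ctx; `make_convert_config` is a hypothetical helper name, not something provided by either library.

```python
import json
from pathlib import Path

# Extra keys the converter expects, with the values used in the article.
EXTRA_KEYS = {"layer_norm_epsilon": 1e-5, "initializer_range": 0.02}


def make_convert_config(hparams_path, out_path):
    """Write a copy of hparams.json extended with the keys needed for conversion."""
    hparams = json.loads(Path(hparams_path).read_text())
    config = dict(hparams)
    config["vocab_size"] = hparams["n_vocab"]    # assumption: same value as n_vocab
    config["n_positions"] = hparams["n_ctx"]     # assumption: same value as n_ctx
    for key, value in EXTRA_KEYS.items():
        config.setdefault(key, value)
    Path(out_path).write_text(json.dumps(config, indent=2))
    return config


# Example call (paths are illustrative):
# make_convert_config("models/345M/hparams.json", "models/345M/hparams_convert.json")
```

Writing to a separate file leaves the original hparams.json untouched for the TensorFlow scripts.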

    Once you have the converted model, put it in models/ and rename it, then change the parameter settings in deployment/GPT2/config.py to those of the larger 345M model.


    class GPT2Config(object):
        def __init__(
            self,
            vocab_size_or_config_json_file=50257,
            n_positions=1024,
            n_ctx=1024,
            n_embd=1024,
            n_layer=24,
            n_head=16,
            layer_norm_epsilon=1e-5,
            initializer_range=0.02,
        ):
    


    Finally, run run_server.py. The model loads successfully and the deployment is done! A quick test confirms that it really is the finetuned Friends model.

    As time goes by, the cost of training large models keeps falling. Today a few hundred dollars is enough to reproduce GPT-2.


    Compiled by | Su Mi (苏宓)
    Produced by | CSDN (ID: CSDNnews)

    When OpenAI released GPT-2 in 2019, training cost $256 per hour, according to a report by Tom's Hardware. Five years on, with GPT-4 and the flagship GPT-4o here, has the cost of training large AI models come down?

    Andrej Karpathy, former AI director at Tesla and a co-founder of OpenAI, recently gave a concrete answer after reproducing GPT-2. In his words, today you can train your own model for about $672, running on one 8xH100 GPU node for 24 hours. Advances in hardware, software and data mean that training the same model now takes less time and less money.

    Karpathy also shared the whole reproduction process on his GitHub project page (https://github.com/karpathy/llm.c/discussions/677). Let's look at how he did it.


    Reproducing GPT-2 in 24 Hours for Only $672

    It is worth noting that shortly after announcing his amicable departure from OpenAI in February this year, Karpathy introduced a new project, llm.c (https://github.com/karpathy/llm.c), which implemented GPT-2 training in about 1,000 lines of hand-written C. Building on that project, he has now reproduced the full 1.558-billion-parameter GPT-2, the very model OpenAI introduced in "Better Language Models and Their Implications" (https://openai.com/index/better-language-models/).

    Karpathy says llm.c does this directly in C/CUDA (about 5,000 lines of code in total), without the conventional training stack, which involves the Python interpreter and considerably more complex deep learning libraries such as PyTorch/JAX and huggingface/transformers.

    In 2019, training GPT-2 was a project that required a whole team and was considered a large model run. Five years later, thanks to improvements in compute (H100 GPUs), software (CUDA, cuBLAS, cuDNN, FlashAttention) and data (for example, the FineWeb-Edu dataset), the model can be reproduced on a single 8xH100 node in 24 hours for $672.

    "This is quite incredible," Karpathy said. There are still caveats and challenges, however: llm.c is not yet perfectly tuned or fully stabilized (loss spikes and bad activation ranges still show up now and then), and the evaluation is not comprehensive (for example, multilingual, code and math performance were not carefully evaluated).


    Preparing for the Reproduction

    Karpathy explains that training GPT-2 with llm.c is very simple because it is written in C/CUDA, so there is no need for Miniconda, Python, PyTorch and the like.

    All you need is an 8xH100 GPU box.

    llm.c is flexible about compute, though. If you have only 1 GPU, you can still get your GPT-2; you just wait 8 days instead of 1. If you have 16 GPUs (for example, using the new Lambda 1 Click Clusters), you can train across multiple nodes and wait only 12 hours.
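These waiting times follow from a fixed GPU-hour budget. A back-of-the-envelope sketch, assuming ideal linear scaling with the number of GPUs (real multi-node runs lose some efficiency to communication, so treat the results as lower bounds):

```python
REFERENCE_GPUS = 8        # one 8xH100 node
REFERENCE_HOURS = 24.0    # wall-clock time quoted for that node
GPU_HOURS = REFERENCE_GPUS * REFERENCE_HOURS   # 192 GPU-hours in total


def estimated_hours(num_gpus):
    """Wall-clock hours for the same run, assuming perfect linear scaling."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    return GPU_HOURS / num_gpus


for n in (1, 8, 16):
    print(f"{n:>2} GPU(s): {estimated_hours(n):6.1f} h ({estimated_hours(n) / 24:.1f} days)")
```

The figures for 1 and 16 GPUs match the 8 days and 12 hours quoted above.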

    Once the node is up, here are the complete instructions for training GPT-2 (it takes only about a minute to go from a blank box to a running job):

    # install cudnn so we can use FlashAttention and run fast (optional)
    # https://developer.nvidia.com/cudnn-downloads
    # for me, CUDA 12 (run `nvcc --version`) running on Linux x86_64 Ubuntu 22.04
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    sudo apt-get update
    sudo apt-get -y install libcudnn9-dev-cuda-12

    # "install" cudnn-frontend to ~/
    git clone https://github.com/NVIDIA/cudnn-frontend.git

    # install MPI (optional, if you intend to use multiple GPUs)
    # (you might also have to install NVIDIA NCCL if it doesn't come with your setup)
    sudo apt -y install openmpi-bin openmpi-doc libopenmpi-dev

    # download and enter llm.c repo
    git clone https://github.com/karpathy/llm.c.git
    cd llm.c

    # download the "starter pack" (~1GB download)
    # contains GPT2-124M weights (used in tests), tokenizer, eval data .bin s
    ./dev/download_starter_pack.sh

    # download the training dataset (FineWeb-Edu 100B token) .bin data shards
    # note: this is a total of 1001 data shards. If you only want to test things
    # out and don't want to do an actual run, feel free to append the number of
    # training shards to download (e.g. for just 10 shards: ./edu_fineweb.sh 10)
    # the full dataset is ~200GB, we can store it here in dev/data directory.
    cd dev/data
    ./edu_fineweb.sh

    # compile (~1 min 1st time for cuDNN mostly, few sec from then on)
    cd ../../
    make train_gpt2cu USE_CUDNN=1

    # and train! (wait 24 hours here)
    mpirun -np 8 ./train_gpt2cu \
        -i "dev/data/edu_fineweb100B/edu_fineweb_train_*.bin" \
        -j "dev/data/edu_fineweb100B/edu_fineweb_val_*.bin" \
        -o "log_gpt2_1558M" \
        -v 250 -s 300000 -g 384 \
        -h 1 \
        -b 16 -t 1024 \
        -d 1048576 \
        -r 0 \
        -z 1 \
        -c 0.1 \
        -k "cosine" \
        -l 0.0006 \
        -q 0.1 \
        -u 700 \
        -n 2000 \
        -x 32000 \
        -ge 1 \
        -y 1 \
        -e "d48"

    Next you will see a bunch of output scroll by, and then the optimization starts:

    num_parameters: 1557686400 => bytes: 3115372800
    allocated 2971 MiB for model parameters
    batch_size B=16 * seq_len T=1024 * num_processes=8 and total_batch_size=1048576
    => setting grad_accum_steps=8
    created directory: log_gpt2_1558M
    allocating 40409 MiB for activations
    val loss 11.129390
    allocating 2971 MiB for parameter gradients
    allocating 742 MiB for AdamW optimizer state m
    allocating 742 MiB for AdamW optimizer state v
    allocating 742 MiB for master copy of params
    step 1/32000 | loss 11.133732 (+nanz)| norm 52.9732 (+nanz)| lr 8.57e-07 | 3056.36 ms | 42.6% bf16 MFU | 343080 tok/s
    step 2/32000 | loss 10.539388 (+nanz)| norm 43.5996 (+nanz)| lr 1.71e-06 | 2747.19 ms | 47.4% bf16 MFU | 381690 tok/s
    step 3/32000 | loss 9.894109 (+nanz)| norm 23.2229 (+nanz)| lr 2.57e-06 | 2753.25 ms | 47.3% bf16 MFU | 381259 tok/s
    step 4/32000 | loss 9.566241 (+nanz)| norm 28.4920 (+nanz)| lr 3.43e-06 | 2741.47 ms | 47.5% bf16 MFU | 381690 tok/s
    step 5/32000 | loss 9.482848 (+nanz)| norm 23.7817 (+nanz)| lr 4.29e-06 | 2752.07 ms | 47.3% bf16 MFU | 381507 tok/s
    step 6/32000 | loss 9.332832 (+nanz)| norm 15.9113 (+nanz)| lr 5.14e-06 | 2751.01 ms | 47.3% bf16 MFU | 381431 tok/s
    step 7/32000 | loss 9.165650 (+nanz)| norm 10.5941 (+nanz)| lr 6.00e-06 | 2753.03 ms | 47.3% bf16 MFU | 381327 tok/s
    step 8/32000 | loss 9.132234 (+nanz)| norm 16.2733 (+nanz)| lr 6.86e-06 | 2748.91 ms | 47.3% bf16 MFU | 381348 tok/s
    step 9/32000 | loss 9.097384 (+nanz)| norm 12.1342 (+nanz)| lr 7.71e-06 | 2748.73 ms | 47.3% bf16 MFU | 381367 tok/s
    step 10/32000 | loss 9.072879 (+nanz)| norm 10.5923 (+nanz)| lr 8.57e-06 | 2749.40 ms | 47.3% bf16 MFU | 381369 tok/s
    ...

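    Each step takes about 2.75 seconds and there are 32,000 steps in total, so now we wait about 24 hours.

    At every step, the training run takes about 1 million tokens from FineWeb-EDU (educational web pages from the internet) and updates the model's 1.558 billion weights so that it gets better at predicting the next token in a sequence.

    By the end, a total of 32,000 * 1,048,576 = 33.6B tokens will have been processed. The loss goes down as the model gets better at predicting the next token. The gradient norm settles at around 0.1 to 1, and the learning rate is warmed up over the first few steps. The model FLOPs utilization (MFU) here is about 50%, which is quite efficient.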
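The arithmetic in this passage is easy to check. The short sketch below uses only the numbers quoted in the article (32,000 steps, 1,048,576 tokens per step, about 2.75 s per step, $672 for the run); the hourly node price is simply derived by dividing the total by 24 hours.

```python
STEPS = 32_000
TOKENS_PER_STEP = 1_048_576       # total batch size in tokens (2**20)
SECONDS_PER_STEP = 2.75           # approximate step time from the log
RUN_COST_USD = 672                # quoted cost of the 24-hour run

total_tokens = STEPS * TOKENS_PER_STEP
wall_clock_hours = STEPS * SECONDS_PER_STEP / 3600
node_price_per_hour = RUN_COST_USD / 24

print(f"total tokens : {total_tokens:,} (~{total_tokens / 1e9:.1f}B)")
print(f"wall clock   : {wall_clock_hours:.1f} hours")
print(f"node price   : ${node_price_per_hour:.2f} per hour")
```

The step time alone gives slightly more than 24 hours, which is why the article speaks of "about" 24 hours.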

    After the 24-hour wait, you can use the dev/vislog.ipynb Jupyter notebook to inspect the main.log log file. For that you also need Python and matplotlib installed.


    Validation and Evaluation

    In the resulting plot, the left panel tracks the loss on the FineWeb-EDU validation data. If you simply run the GPT-2 released by OpenAI and evaluate its loss on this data, you get the red horizontal line (loss 2.83).

    By comparison, Karpathy's run overtakes it quickly, at around step 5,000.

    Karpathy admits, however, that this comparison is not fair, because GPT-2 was trained on the never-released WebText dataset, so there may be a considerable distribution shift. For example, if you finetune the OpenAI model for 1,000 steps at LR 1e-4, the loss drops quickly to the blue line (loss 2.61), because the model is rapidly adapting to the statistics of the new data.

    Karpathy writes that he likes to treat the validation loss as a sanity check, but that for an actual comparison one should look at fixed third-party evaluations. HellaSwag is a well-behaved, smooth, common and frequently cited evaluation that also provides early signal. Its items are simple common-sense scenarios in which the model has to choose the correct continuation.

    The right panel shows the HellaSwag evaluation, where the llm.c model overtakes GPT-2 at around step 25K (earlier than GPT-2, which is estimated to have been trained on ~100B tokens. This may be related to improved data quality; Karpathy notes that the same was observed in the earlier 124M run).

    The green line is the GPT-3 model of the same size. Its architecture is essentially the same as GPT-2's with minor differences (context length 1024 -> 2048), but it was trained on 300B tokens (roughly 10x more tokens than this run).

    Karpathy adds that even HellaSwag is not an ideal single point of comparison, because it tests simple English and common sense rather than multilingual ability, math or code. It may be that the WebText dataset weighted those areas more heavily and that they "stole" some model capacity; we cannot know, because it was never released. Finally, at low model capability such as GPT-2's, good evals are generally harder to come by, because the models do not understand multiple choice and their sample quality is not high enough to score above chance on standard math or code evals.


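    Parameter Guide

    Let's look in more detail at the arguments passed to the training run. OpenAI's GPT-2 release included the model weights but very few details, whereas GPT-3 came without weights but with plenty of details. So in many cases Karpathy follows the hyperparameters of the GPT-3 paper, because the GPT-2 paper contains very, very little information:

    • mpirun -np 8 ./train_gpt2cu (the launch command: use MPI to launch 8 processes, each running training on 1 GPU, for a total of 8 GPUs on this 8xH100 node). If you have 4 GPUs, use -np 4. If you have only 1 GPU, you can skip MPI and just run ./train_gpt2cu.

    • -i -j are the training and validation split token files, downloaded with edu_fineweb.sh

    • -o is the output directory that logs and checkpoints are written to

    • -v 250 asks to evaluate and log the validation loss every 250 steps

    • -s 300000 asks to sample some tokens every 300,000 steps. Because the total number of steps is smaller than this, it is a simple way of turning sampling off; sampling happens only once, at the very end.

    • -g 384 sets the number of tokens sampled at the end to 384

    • -h 1 asks to evaluate HellaSwag accuracy

    • -b 16 sets the micro-batch size to 16. If you run out of memory, reduce it, for example to 8, 4, 2, all the way down to 1.

    • -t 1024 sets the maximum sequence length to 1024, the same as GPT-2.

    • -d 1048576 asks for a total batch size of 2 to the power of 20 tokens, following the GPT-3 hyperparameter table. The code makes sure the requested total batch size is met and works out the gradient accumulation needed for the "inner loop" of the optimization. For example, we saw above that there are 8 GPUs, each processing 16 x 1024 tokens, so each micro-step (a single forward and backward pass) handles 8 x 16 x 1024 = 131,072 tokens. The code therefore computes 8 gradient accumulation steps to reach the required batch size of about 1M tokens per step: 8 forward and backward passes, then a single update.

    • -r 0 sets recomputation to zero. Recomputation is a way of trading compute for memory. With -r 1, part of the forward pass (the GeLU) is recomputed during the backward pass. This means it does not have to be cached, which saves memory at the cost of more compute. So if you are running out of memory, try -r 1 or -r 2 (which also recomputes the layernorms).

    • -z 1 turns on ZeRO-1 (that is, optimizer state sharding) across multiple GPUs. If you train on more than 1 GPU, there is nothing to think about here; it should basically always be on. With 1 GPU the setting has no effect.

    • -c 0.1 sets the weight decay to 0.1. Only the (2D) weights are decayed, exactly as in GPT-2, and the number itself comes from the GPT-3 paper.

    • -k "cosine" sets the cosine learning rate schedule, which is the default.

    • -l 0.0006 sets the maximum learning rate to 6e-4. The GPT-3 paper says to use 2e-4 for this model size, but here the learning rate is tripled; it seems to train faster without any problems. This has not been tuned carefully.

    • -q 0.1 means the learning rate decays to 10% of the maximum LR over the course of training, consistent with the GPT-3 paper.

    • -u 700 ramps the learning rate from 0 up to the maximum over the first 700 iterations, which at a total batch size of 0.5M corresponds to 350M tokens, following the GPT-3 paper.

    • -n 2000 asks to save a model checkpoint every 2,000 steps.

    • -x 32000 asks for 32K steps in total. This number was chosen because it is a nice number and fits neatly into 24 hours.

    • -ge 1 enables a recently merged GeLU recompute setting for cuBLASLt (optional)

    • -y 1 sets the "resume" flag. If your training crashes or hangs for some reason, you can CTRL+C and rerun the same command, and it will try to resume the optimization. llm.c is bitwise deterministic, so you get the same result as if there had been no crash.

    • -e "d48" asks to initialize a depth-48 GPT-2 model from scratch.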
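Two of these settings are easier to follow as code: the gradient accumulation implied by -b, -t and -d, and the learning rate schedule implied by -l, -q, -u and -x. The sketch below is not taken from llm.c. The schedule assumes the common form of a linear warmup followed by a cosine decay to a fraction of the maximum, and has only been checked against the learning rates printed in the log earlier (8.57e-07 at step 1, 8.57e-06 at step 10). The function names are hypothetical.

```python
import math


def grad_accum_steps(total_batch, micro_batch, seq_len, num_procs):
    """Micro-steps (forward + backward) needed per optimizer update."""
    tokens_per_micro_step = micro_batch * seq_len * num_procs
    if total_batch % tokens_per_micro_step != 0:
        raise ValueError("total batch must be a multiple of B * T * num_procs")
    return total_batch // tokens_per_micro_step


def learning_rate(step, max_lr=6e-4, warmup=700, total=32_000, final_frac=0.1):
    """Assumed schedule: linear warmup, then cosine decay (step is 0-based)."""
    if step < warmup:
        return max_lr * (step + 1) / warmup
    progress = (step - warmup) / (total - warmup)
    min_lr = max_lr * final_frac
    return min_lr + 0.5 * (1.0 + math.cos(math.pi * progress)) * (max_lr - min_lr)


print(grad_accum_steps(1_048_576, 16, 1024, 8))   # matches grad_accum_steps in the log
print(f"{learning_rate(0):.2e}")                  # logged as step 1
print(f"{learning_rate(9):.2e}")                  # logged as step 10
```

Halving -b on the same 8 GPUs doubles the number of accumulation steps, so the optimizer still sees the same total batch; only the memory footprint per micro-step changes.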


    Memory Guide

    The biggest constraint most people are likely to face is that their GPU does not have 80 GB of memory.

    Karpathy says that is fine: if you are patient, you can still run everything above, only more slowly. So what can you adjust if the model does not fit? The most important knob is the micro-batch size -b. Try reducing it, but keep it to nice numbers, for example from 16 to 8, then 4, 2, 1.

    Beyond that, you can also try the recompute setting -r, with values 0 (fastest, most memory), 1 (slightly slower, but saves a lot of memory) or 2 (slightly slower, with a smaller memory saving).

    The next thing you can do is disable the fp32 master weights with -w 0 (the default is 1).

    According to Karpathy, no fp32 copy of the parameters is maintained in that case. Experience from several earlier runs suggests this is fine, probably thanks to the use of stochastic rounding. If even that does not fit, you can lower the maximum sequence length -t from its default of 1024 to 512, 256 and so on, but this makes your model worse, because you are reducing its maximum attention span.


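    Code

    "I may well be biased, but llm.c really is elegant," says Karpathy:

    • It needs only basic CUDA dependencies to run.

    • It is a direct, minimal and readable C/CUDA implementation. llm.c has about 5,000 lines of C/CUDA code in total. We try to use C rather than C++ to keep things simple. Neural network training is just one while loop performing the same simple arithmetic operations (think +, -, *, /) on a single float array; it really should not be that complicated.

    • It compiles and runs very quickly (a few seconds), so you spend more time debugging and less time waiting.

    • It allocates all of its GPU memory once at the start, and from then on the memory footprint stays exactly constant during training. So once training has started, you know it will not run out of memory at any point in the run.

    • It is bitwise deterministic.

    • It is efficient, reaching close to ~50% model FLOPs utilization (MFU).

    • The main entry point and most of the code are in the file train_gpt2.cu. It contains the GPT-2 model definition and the training loop in roughly 2,000 lines of code, and it imports many helper files with various utilities and the individual layer implementations from the llmc directory. cloc llmc reports 23 files with 3,170 lines of code, while cloc train_gpt2.cu currently reports 1,353 lines of code.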


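    Multi-Node Training

    If you belong to the upper class that owns a lot of GPUs, llm.c supports multi-node training.

    Karpathy says the largest run he has personally done so far used 16 H100 GPUs across 2 nodes on Lambda's new 1-Click Clusters. This, he notes, is "one of the downsides of being unemployed": no money.

    He adds that the Lambda team has published detailed instructions on how to train llm.c models on their 1-Click Clusters. For example, with a 512-GPU H100 cluster at about $2,300 per hour, you might be able to train GPT-2 in roughly 30 minutes. You would need to increase the total batch size (for example, to about 8M) and probably tune the hyperparameters a little. Karpathy has not tried this himself, but says it probably works and would be very cool.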


    PyTorch Comparison

    According to Karpathy, a reasonably comparable run in PyTorch, using his parallel PyTorch implementation, would look like this:

    torchrun --standalone --nproc_per_node=8 train_gpt2.py \
        --input_bin "dev/data/edu_fineweb100B/edu_fineweb_train_*.bin" \
        --input_val_bin "dev/data/edu_fineweb100B/edu_fineweb_val_*.bin" \
        --write_tensors 0 \
        --model d48 \
        --batch_size 8 --sequence_length 1024 --total_batch_size 1048576 \
        --dtype bfloat16 \
        --compile 1 \
        --tensorcores 1 \
        --flash 1 \
        --num_iterations 32000 \
        --warmup_iters 700 \
        --weight_decay 0.1 \
        --overfit_single_batch 0 \
        --learning_rate 0.0006 \
        --zero_stage 1


    The PyTorch code is meant as a testing reference rather than an actual implementation, so the training loop differs slightly in places (for example, the data loader does not permute the shards), but it may still be useful as a reference point. He also changed the default vocabulary size from 50257 to 50304 for efficiency. With that, the current PyTorch nightly gives:

    step 16/32000 | train loss 8.903997 | norm 8.3474 | lr 1.37e-05 | (3381.88 ms | 310057 tok/s)
    step 17/32000 | train loss 8.870140 | norm 3.7936 | lr 1.46e-05 | (3381.95 ms | 310051 tok/s)
    step 18/32000 | train loss 8.875732 | norm 9.4993 | lr 1.54e-05 | (3393.09 ms | 309033 tok/s)
    step 19/32000 | train loss 8.817432 | norm 2.8345 | lr 1.63e-05 | (3379.75 ms | 310253 tok/s)
    step 20/32000 | train loss 8.798056 | norm 4.1234 | lr 1.71e-05 | (3386.53 ms | 309631 tok/s)
    step 21/32000 | train loss 8.777574 | norm 2.8010 | lr 1.80e-05 | (3386.05 ms | 309675 tok/s)
    ...

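    Karpathy says he cannot claim full confidence that the PyTorch script is maximally tuned, but some observations can be made.

    PyTorch appears to use more memory (about 80 GB in this run), whereas llm.c uses 57 GB (a 29% improvement). Memory matters because it lets you increase the batch size (for example, llm.c could go up to a micro-batch of 24 here), which makes things a little faster.

    Second, PyTorch takes about 3,386 ms per iteration compared with 2,750 ms for llm.c, so llm.c is about 19% faster. Some of the reasons for the gain are known. For example, llm.c includes optimizations such as the fused classifier that kicks off the backward pass, which, according to Karpathy, torch.compile does not currently do. It is also possible that the script is not yet fully tuned. Either way, Karpathy presents the comparison in order to:

    1) let others inspect it, try it out, compare and help tune it;

    2) show that llm.c is already quite optimized and fast for the specific case of GPT-2/3 training.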
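Both percentages quoted above are relative reductions measured against the PyTorch run. A tiny sketch reproducing them from the article's numbers (the function name is mine):

```python
def relative_reduction(baseline, improved):
    """Fraction by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline


time_gain = relative_reduction(3386, 2750)     # ms per iteration, PyTorch vs llm.c
memory_gain = relative_reduction(80, 57)       # GB, PyTorch vs llm.c

print(f"step time reduced by {time_gain:.1%}")
print(f"memory reduced by    {memory_gain:.1%}")
```

Measured the other way round, as throughput, the same step times correspond to a somewhat larger speedup, so it matters which baseline a percentage refers to.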


    Final Model

    • The main.log file (http://llmc.s3-us-west-2.amazonaws.com/gpt2_1558M/main.log).

    • The model_00032000.bin llm.c bin model file (http://llmc.s3-us-west-2.amazonaws.com/gpt2_1558M/model_00032000.bin)

    • The model converted to a huggingface transformers GPT-2 model has been uploaded here: karpathy/gpt2_1558M_final2_hf (https://huggingface.co/karpathy/gpt2_1558M_final2_hf).

    • A version of the model trained for 100k steps has since been added (https://huggingface.co/karpathy/gpt2_1558M_final3_hf), with a HellaSwag score of 57.7, along with a model trained for 330K steps (https://huggingface.co/karpathy/gpt2_1558M_final4_hf), with a HellaSwag score of 62.7.


    Model Export

    The model can be exported like this, for example:

    python dev/eval/export_hf.py --input log_gpt2_1558M/model_00032000.bin --output gpt2_1558M_export

    You can then run the Eleuther evaluation harness, or run the huggingface sampling pipeline to get samples from the model:

    # take model for spin
    import torch

    output = "./gpt2_1558M_final2_hf"

    # set pytorch seeds
    torch.manual_seed(42)
    torch.cuda.manual_seed(42)

    prompt = "In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(output)
    model = AutoModelForCausalLM.from_pretrained(output, attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16, device_map='cuda')
    model.eval()
    tokens = tokenizer.encode(prompt, return_tensors="pt")
    tokens = tokens.to('cuda')

    output = model.generate(tokens, max_new_tokens=500, pad_token_id=tokenizer.eos_token_id, do_sample=True, top_k=50, num_return_sequences=4)
    samples = tokenizer.batch_decode(output)
    for sample in samples:
        print('-'*30)
        print(sample)


    The 400B Token Run

    Karpathy also tried training GPT-2 on far more than 33B tokens. Specifically, he changed -x to 400,000 in order to train on 420B tokens (even more than the GPT-3 model, which was trained on 300B).

    This model did well until around step 330,000:

    It far surpasses GPT-2 and GPT-3 of the same size on HellaSwag (reaching up to about 61%), but unfortunately it became unstable from that point on and ran into problems.

    There were more, smaller spikes along the way, but Karpathy had configured the code to skip updates when an instantaneous instability was detected (using the -sl 5.0 -sg 5.0 flags), which helped mitigate and postpone the problem.

    Karpathy believes he has not yet been careful enough about initialization, activation ranges and overall training stability, and that there are deeper issues that gradually push the model into an unstable state, especially for larger models and longer training runs. This is an area he wants to investigate further.

    That is the full story of Karpathy's experiment.

    AI Training Will Not Keep Getting Cheaper

    Some argue, however, that advances in hardware, software and training data do not mean that cutting-edge AI training will keep getting cheaper.

    Anthropic CEO Dario Amodei has said that AI models being trained today already cost $1 billion, and that more expensive models will reach $100 billion as early as 2025.

    This is because hardware is becoming more powerful but also more expensive. For example, an NVIDIA H100 currently sells for $40,000 per unit. The next-generation Blackwell AI chip is expected to sell for as much as $70,000, and unless there is a hardware breakthrough such as the Sohu AI chip (an ASIC designed specifically for transformers), a complete server rack will cost $3 million or more.

    Beyond cost, the growing power demand of AI data centers is starting to worry some experts. A single H100 chip running at an average annual utilization of 61% consumes 3.7 MWh of electricity per year. Last year, Nvidia and all the other players together sold more than 3.8 million AI GPUs, equivalent to 14.3 TWh of electricity per year, enough to power 1.3 million average American households.

    Yet even with all the money and effort going into AI, the CEO of Google DeepMind says that current models are still only at the IQ level of a cat. So billions of dollars still need to be invested in future models. But if you want to try building your own LLM with an older model, a few hundred dollars is enough using Karpathy's approach.

    Sources:

    https://github.com/karpathy/llm.c/discussions/677

    https://www.tomshardware.com/tech-industry/artificial-intelligence/former-tesla-ai-director-reproduces-gpt-2-in-24-hours-for-only-672
