᳡ࡵ⛁㒓˖˄010˅88258888DŽ dbqq@phei.com.cnDŽ 䋼䞣ᡩ䆝䇋থ䚂ӊ㟇 zlts@phei.com.cn ˈⲫ⠜։ᴗВ䇋থ䚂ӊ㟇 䇋Ϣᴀ⼒থ㸠䚼㘨㋏ˈ㘨㋏ঞ䚂䌁⬉䆱˖˄010˅88254888DŽ ᠔䌁ф⬉ᄤᎹϮߎ⠜⼒к᳝㔎ᤳ䯂乬ˈ䇋䌁фкᑫ䇗ᤶDŽ㢹кᑫଂ㔎ˈ ܗ ॄ ᭄˖3500 ݠ ᅮӋ˖39.00 ॄ ˖2014 ᑈ 4 ᳜ 1 ॄࠋ ᓔ ᴀ˖720h1000 1/16 ॄᓴ˖13 ᄫ᭄˖162 गᄫ ࣫ҀᏖ⍋⎔ऎϛᇓ䏃 173 ֵㆅ 䚂㓪˖100036 ߎ⠜থ㸠˖⬉ᄤᎹϮߎ⠜⼒ 㺙 䅶˖ϝ⊇Ꮦⱛᑘ䏃䗮㺙䅶ॖ ॄ ࠋ˖࣫Ҁᅛ᯳ॄࠋॖ 䋷ӏ㓪䕥˖߬ 㟿 Ё⠜ᴀк佚 CIP ᭄Ḍᄫ˄ 2014˅ 043145 ো ĉ. ķĂ Ċ. ķIĂ ċ. ķӕϮㅵ⧚ˉֵᙃㅵ⧚ Č. ķF272.7 ISBN 978-7-121-22605-2 2014.4 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵ˋ IT ᶊᵘ䆒䅵ⷨお㒘㓪㨫 . ü࣫Ҁ˖⬉ᄤᎹϮߎ⠜⼒ˈ к⠜㓪Ⳃ˄CIP˅᭄ ⠜ᴗ᠔᳝ˈ։ᴗᖙおDŽ 㒣䆌ৃˈϡᕫҹӏԩᮍᓣࠊᡘ㺁ᴀкП䚼ߚܼ䚼ݙᆍDŽ ᴀк䗖ড়ϔᅮᶊᵘ⸔ᶊᵘ㒣偠ⱘҎ䯙䇏DŽ кЁᡒࠄⳌ݇ⱘᶊᵘ㒣偠ˈᇍᙼҞৢⱘᶊᵘ䆒䅵ᎹЁ䛑㛑䍋ࠄᕜདⱘᐂࡽ⫼DŽ 䖯ⱘᶊᵘDŽ᮴䆎Դህ㘠Ѣાϔ㸠Ϯ䛑ৃҹҢᴀܜⱘḜ՟ᵕ݊ᅲ⫼ˈҷ㸼њ䆹乚ඳ䕗 Ḝ՟ሩᓔϢᶊᵘⳌ݇ⱘ䅼䆎DŽᴀк㗙ᴹ㞾Ѧ㘨㔥ǃᬭ㚆ǃӴ㒳㸠Ϯㄝ乚ඳˈߚѿ ᴀкҹ᭄ᯊҷЎ㚠᱃ˈ䙔䇋㨫ৡӕϮЁⱘϔ㒓ᶊᵘᏜˈ㒧ড়ᎹЁⱘᅲ䰙 ݙ ᆍ ㅔ ҟ ေፇ౧ࠀ˧ၹਗ਼ᄊᛡ˞˸ਅὊ˞ႃηᤂᖹࠄဘድጺӑܫေὊཀྵՑಪ ܫ᜶රၹਗ਼ᄊʽᎪெঃᝮैॹᮌᜂδߛὊᏫ˅ᤇᭊ᜶ᤉᛡѬౢଈ ὊᤈࡃڂৌᒰТ᧘᜶ὊᏫԡ௧͜ፒᄊ CDR ភӭѬౢਫ਼ˀᑟଢΙᄊǍ ၹਗ਼ᄊʽᎪᛡ˞˗ᘔեᅌܸ᧚ᄊࠇਗ਼ྲढ़֗ࠇਗ਼ᭊරηৌὊᤈ̏η ዝὊ̰Ꮻᒱࠫࠇਗ਼ᄊေᝍదᣗܸϠࣀὊᖹᩙ౧ˀΈǍ Ѭౢ˗ᜂឨॆ˞ʷښ˔௧ҫိᭊ᜶ᄊࢺͻႃភ߹ЛˀՏᄊː˔̡ड़ड़ ႃភὊࠄᬅᤰភᄬᄊԣၷำவरХ˗ʷ˔௧௹ʽˁనԤᐊܹὊԳʷ ႀ̆ขᅼ௴ᤰភЯࠔὊː˔ᤰភᛡ˞വरዝͫΓݠܴᫎ᫂ᫎᄊ᫂ ԪएॹཀྵॢࣀὊ๎ᠠܸ᧚ᠫູὊࣳ˅ขԩ४ᓢݞᄊ౧ǍᤈመѬౢὊ ᄊࠇਗ਼Ὂݠ౧࠲˨ՏࠫॠὊࠇਗ਼ᄊଌی˞ᄱͫᄊ̡Իᑟ௧߹ЛˀՏዝ ࠇਗ਼ᛡ˞ѬౢԻᑟᎥܸܿ̀᧚ᄊࠇਗ਼ᛡ˞దηৌǍΓݠὊː˔ᤰភᛡ ᜂˏफǍᄬҒὊႃηᤂᖹԶᑟ۳̆ CDRហጺᤰភᝮै˞˟ᄊڂԔ ߛϲቇᫎ˞ڂʽᎪெঃႀ̆᧚ࢽܸὊ̗ၷՑԶᑟᜂδ႑ 3 ܹὊࡃ ረү̉ᐏᎪݠԣᄊ̭ܹὊඈܹ̗͘ၷܸ᧚ᄊʽᎪெঃὊᤈ̏ښ ေᄊဘ࿄ܫʷnjႃηᤂᖹʽᎪெঃ ᮍᓎ ေ˗ᄊऄၹܫʽᎪெঃ ႃηᤂᖹښBEPPQ ష( N N Y ॷጸ͈ HBase HDFS ܱᦊູ ѵ ৌ ๗ Ꭺᮆः segment ࣃѬዝ ေ ࣃઅԩܫᮕ આᇾ ѬዝѬឈᎪᮆ URL ः ఞழ ళઅԩ ࿄গ segment ࣃઅԩ URL ः ଢԩ ళઅԩ URL ྔԩ Ꭺᮆ ಖᝮ ጼൣ ေܫൣ ͊Ҭጼ Ꭺᮆः ࣃѬዝ Y URL ः ళઅԩ උࠫ ԩ URL ళઅ ଢԩ URL ͈ URL ˚ උࠫ URL Ѭዝ ۳ю URL ெঃ ၹਗ਼ ਫ਼ᇨǍڏЦʹืሮݠʾ ˔Ղᆊᄊ˔̡؞ݞὊ˞ድюᖹᩙଢΙΚǍ ፒᝠѣඈیᄊᎪᮆѬዝὊѾၹവڧࠫऄᎪ֗ڧ4Ὄಪඈ˔̡᫈Ꭺ юᆸভǍ ὊଢᰴᎪᮆѬዝᄊیὊᤉᛡᎪᮆѬዝὊˀல͖ӑവیોིᎪᮆѬዝവ ὊᯫЏྔԩᎪᮆὊཀྵՑࠫྔԩᄊᎪᮆڧڡ 3Ὄࠫళᅼᄊ URL 2Ὄࠫࣃᅼᄊ URL Ὂોི۳ю URL ѬዝюѷᤉᛡѬዝǍ ᤉᛡଢԩǍڧڡ 1ὌࠫʽᎪெঃᄊ URL ေவขᄊืሮݠʾܫʽᎪெঃ ေவขᄊืሮܫ̄njʽᎪெঃ ေҪᑟǍܫଢΙଈ ᬤᅌ Hadoop షࣱԼጇፒᄊѣဘὊԻ̿ࠄဘʽᎪெঃᄊߛϲὊՏ ᤂᖹଢΙ᧘᜶ᄊᖹᩙΚǍ 3 Hadoop ᡔᴃ⬉ֵ䖤㧹ଚϞ㔥᮹ᖫ໘⧚Ёⱘᑨ⫼ᶊᵘ ᄊὊԻ̿ᤰ FTP வیӊ᧔ᬷ֗ҫᣒሮὊՏ˷ஃે͈ዝ ःὊݠ OraclenjDB2 ᄊ̔૱ὊیଌԻࠄဘࠫТጇ ጇፒଢΙ᫈ଌǍڊᄊ᧔ᬷὊ̉ᐏᎪᎪᮆЯࠔᄊྔԩܱ֗ࠫ ଌࡏ᠇᠊ˁܱᦊጇፒᄊᤉᛡ̔૱Ὂӊၹਗ਼njʽᎪெঃ ষሖ ᣥК҂ᬷᏆ HDFS ͈ጇፒ֗ HBase ः˗Ǎ ̰ႃηᤂᖹጇፒҬ٨ࠀᖍԩၹਗ਼۳వηৌ֗ʽᎪெঃηৌὊ ᭄⑤ ᧫ࠫඈʷᦊѬᄊЦʹҪᑟ̮ፁݠʾǍ ਫ਼ᇨǍڏေጇፒᄊᣤவವݠʾܫေืሮὊʽᎪெঃܫ۳̆ʽᤘ ʼnjʽᎪெঃጇፒᄊషவವ 4 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ӑ᧘ጸὊѳѬᒰࠫऄᄊࡏፇὊݠᇫ̔-ᇫӝڱ࠲Х˗ᄊዝѿᤉᛡവ ฌв URL ԣХዝѿᄊྔԩὊࠫᤉᛡፒʷኮေὊࣳᎶ̆ેˤӑߛϲ˗Ǎ ᠇᠊ऄၹҪᑟᄊЦʹካขࠄဘǍࠄဘ̀ᎪᮆѬዝጊळὊᤰ̉ᐏᎪ ᑨ⫼ሖ ូएǍ ͈ᤉᛡˀՏᄊืሮืᣁǍ᧔ၹप൦ូएኖ႕Ὂᑟܵஃેܸࣳԧ᧚ᄊ ѬஃᄊৱцʾᤰܳښႀѼலᑟҧὊᑟܵܬᫎnj᧘តὊՏЦ ܵᤉᛡ͊ਓ˙ࣳᐏᄊืሮូएὊࣳ˅ᑟܵ҄ᓬགᄊ͖Џጟnjᡔ ေᑟܫ҄Ὂࣳ᠇᠊ Hadoop ᬷᏆᄊᤂᛡኮေ֗ጇፒઑெঃኮေǍ ေὊ ଢΙᬷᏆᄊ᫈ܫӑڱᄊവڱေืሮവܫҪᑟࡏࠄဘ̀ ࡳ㛑ሖ Տᝈᓤᤉᛡ༧ำᦊᎸǍ ᠫູࡏࠄဘ̀ࠫྭေᠫູᄊᒭүᦊᎸ֗үগੱ࡙ὊࠫѬ࣋रᬷᏆ˗ˀ ڂፒࣱԼࡏଢΙᤉʷ൦ᄊચ៶Ὂ̿ଢΙᒭүӑᦊᎸ֗ुভᄊᤂ፥ᑟҧὊ एὊᭊ᜶ࠫྭေᠫູࡏ֗ጇాܭႀ̆Ѭ࣋रࣜᄊᆶ͈ᦊᎸᄊ 䌘⑤ሖ ሮǍ ϲ˗ԝǍᤰѬ࣋रᝠካԻ̿ࠄဘᄊຍฤnjᣁ૱njಣᰎ֗ᜉᣒ Ὂ࠲ҫᣒ҂Ѭ࣋रߛیຍฤὊతጼોིᮕЏࠀ˧ݞᄊവ ေࣱԼὊູ̰ચԩѣਫ਼ᭊᄊὊፃܫࡏ௧Ѭ࣋रܸ ᭄ሖ ኮေ֗߷Лভྲढ़Ǎ रᤉᛡ᧔ᬷǍጇፒܱࠫଢΙፒʷ᫈ଌὊЦదनஊভnjᰴভᑟnjԻᄣ 5 Hadoop ᡔᴃ⬉ֵ䖤㧹ଚϞ㔥᮹ᖫ໘⧚Ёⱘᑨ⫼ᶊᵘ ਫ਼ᇨǍڏ੨ݠʾ ေጇፒᄊྭေᎪፏᦊᎸܫᤉᛡ᫈҄ǍʽᎪெঃܗ˨ᫎᭊ᜶ၹ༢ ေᝠʽὊᭊ᜶ᝠː˔߹டᄊЯᦊᬷᏆᎪፏὊᬷᏆᎪፏྭښ Ὂ߲̓˨ᫎᄊ᫈ᭊ᜶ˑಫ҄Ὂ̿δᎪፏᦊᎸ߷ЛǍڂؓὊ ေଐஷˀܵ߹ܫ֗ႃηᤂᖹЯᦊᎪፏ̉ᐏὊᏫ˅ Hadoop ᬷᏆᄊ߷Л ေ Hadoop ᬷᏆ௧ܫႀ̆Ꭺፏྔԩืሮᭊ᜶̉ᐏᎪᠫູஃેὊ ᄊЯࠔߛϲ҂ HDFS ͈ጇፒ˗Ǎڧ ڡ Ὂతጼંઅԩ҂ᄊ URLڧڡ MapReduce ᄊவरࣳԧઅԩ̗ၷᄊ URL ὊཀྵՑᤰڧڡ ˗Ὂྔᙂ͘ಪʷࠀᄊѷ̗ၷ᜶અԩᄊ URL ःڧڡКྔᙂᄊڧڡ ԝǍҜʾᄊ URLڧڡ URLnjࣃፃѬዝᄊ URL ᄊ URLnjࣃፃઅԩᄊܭὙԝ᧘͘ં᧘ڧڡ ᮠnjᣄ͈Яࠔᄊ URL njྟڏὊࣳᤉᛡnjԝ᧘୲ͻǍХ˗୲ͻԝᬔڧڡ ͈˗ଢԩ URL መોིʷࠀᄊѷὊᒭүઅԩʺ፥ᎪηৌᄊሮऀੋᏨᑮవǍሮऀ̰ெঃ ေืሮ௧ʷܫʹ᠇᠊̰̉ᐏᎪጇፒ˗ྔԩᎪᮆᄊЦʹЯࠔηৌǍЦ 㔥㒰⠀㰿 ေፇ౧ᤉᛡүগ࡙ᇨǍܫေҪᑟὊࣳࠫܫྀጷΎၹՊመऄၹ ေፇ౧ᤰ Web ᮆ᭧࡙ᇨὊࣳ˅ଢἹ̉ᮆ᭧Ὂܫ᠇᠊࠲ऄၹҪᑟ ሩ⼎ሖ ፒʷኮေǍڧڡ ̿ΧஃᖹᩙำүὊࠄဘ̀ URL ХϠݞྲढ़ὊಪЯࠔϠݞྲढ़ᤉᛡࠇਗ਼ጺѬὊࣳஃેᄬಖࠇਗ਼ᏆଢԩὊ ழὊˀல߹ؓǍࠄဘ̀ၹਗ਼ᛡ˞ፒʷѬౢὊ۳̆ࠇਗ਼ᄊ᫈ᛡ˞Ὂគѿ བྷ᫃ឈලԣၹឈලᄊྔԩὊಪਫ਼࡛ዝѿथѬឈឈःǍឈःࠀరఞ ੋᇫ̔ -ॲӰὊࠫዝѿᤉᛡጊळǍࠄဘ̀ឈःѬዝኮေὊᤰࠫᎪፏ 6 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ेᬷᏆ˗ᄊҬ٨ѣဘᩲឨὊட˔ᝠካሮࣳˀ͘ጼൣὊՏѬ࣋र ေǍܫ˗ᤂᛡᄊܳ˔ᓬགښᄊϋὊᤰѬ࣋रᝠካ࠲Х͊ҬѬᝍࣳ ေܸܫښေܸᬷᄊᣄ͈Ǎܫ2ὌHadoop షவವ᧔ၹࣳᛡ License ᄊੇవὊదᎁᝍ̀ጇፒੱࠔࣜᄊᰴੇవԍҧǍ ᄊᤰ PC Ҭ٨ʽὊܸܸᬌͰ̀Ҭ٨֗ߛϲᄊੇవὊ̿ԣः ᧔ၹ X86 ښѬ࣋रጇፒదᅌᰴࠔᩲভᄊྲགὊࣳ˅ᝠၹᤂᛡ Ѭ࣋रߛϲ֗Ѭ࣋रᝠካጇፒǍ ॷషúúѬ࣋र͈ጇፒˁѬ࣋रᝠካὊ थ̀ʷடݓ߹டᄊ 1ὌHadoop షவವ௧۳̆Ѭ࣋र۳ᆩὊЍѬѾၹѬ࣋रːܸ ʽᎪெঃጇፒ᧔ၹ Hadoop షᝍхவವᄊ͖ҹదݠʾїགǍ njʽᎪெঃጇፒவವᄊ͖ҹپ ⠀㰿㔥㒰㋏㒳 Ϟ㔥᮹ᖫ໘⧚㋏㒳 ষᴎ Gnষ᮹ᖫ ষᴎ Ϟ㔥᮹ᖫ ষᴎ Ϟ㔥᮹ᖫ Clients Internal ` Clients Internal ` ... Web Server Web Server Hmaster Hmaster Node Node Name Name Node Data Node Data Node Data Node Data Node Data Traker Traker Job Job Switch Aggregation Internal Ѧ㘨㔥 ` 7 Hadoop ᡔᴃ⬉ֵ䖤㧹ଚϞ㔥᮹ᖫ໘⧚Ёⱘᑨ⫼ᶊᵘ ࠓࢅֺ̣dܤཙࡎēဍୣಾݮဟ Windows ༔થ Đပಇదԅٲc༔થᅽੋޯۦcࠓࢅֺ̣ď୶ჼٲޯܤ༔થރྻ ēճ୶Ӊ҅ܙปރࡂԅϦ೧ܬಓСޝᅖఉdտұϵူӖಬމغ ޏēฑనసࠝ MVPē੶ᄉּԙС٤ഓᆇ༿ࢳڳ㗙ҟ㒡˖ֺߙ ῚूదҧᄊஃǍ ေጇፒὊ˞សНՃᄊၹਗ਼ድюᖹᩙଢܫᖹᤂᛡࣱԼʽᦊᎸ̀ʽᎪெঃ ౽ᄵጟᤂښڡὊࣃፃੇҪڄᄊ߹டᝍхவವ֗НՃूܸᄊካขஃે ܸ̈НՃὊ̵̓Κ੬ᒭ˟ᆑԧᄊ BDP ࣱԼᣄ͈ӊե Hadoop ࣱԼ ေጇፒોིʽᤘὊଢѣ̀߹டᄊᝍхவವǍХ௧ܹܫࠫʽᎪெঃ ЯὊϸܹܸ̈njӨ˞nj̎ηܳࠒᅼՐᄊܸ͍ˊᦐ᧫ڎښᄬҒ ঌᤴnjᰴᤂᛡǍی̿ΎᎪፏྔᙂnjᎪᮆѬዝ֗ʽᎪᛡ˞വ ட˔ᬷᏆ˗ԧၷᬪᩲឨᄊзǍᤈመᝠவವԻښጇፒԻδᬪ 8 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ቤ̂֗ঌᤴԧ࡙Ǎ Л᧚ˊҬὊขᆸંଧᩐᛡˊԧ࡙வՔὊ̰ᏫˀѾ̆ᩐᛡ͍ˊ ਓएὊᒱࠇਗ਼ืܿὙ 3Ὄႀܸ̆᧚ሏጳߛϲὊᩐᛡ͍ˊขѬౢ 2Ὄܸ᧚ሏጳߛϲὊᒱࠇਗ਼ขঌᤴᖍ४̔ηৌὊᬌͰࠇਗ਼໘ ҫ٨ভᑟ֗ߛϲቇᫎὊᄰଌҫܸ̀ॷጇፒᤂᖹ፥ઐੇవὙܙ1Ὄ ေᑟҧ Ǎͮ௧᧔ԩ̿ʽᄊኖ႕͘ᒱ̿ʾˀᡜܫ̰ᏫଢᰴˊҬ ԋԾὊѓ࠶ॷጇፒᄊߛϲ᧚ὊѓᣐॷጇፒᄊԍҧὊ͋ܬ2Ὄ ေᑟҧὙܫҫॷጇፒᄊ٨ভᑟ֗ߛϲቇᫎὊଢᰴˊҬܙ௧1Ὄ ὊᄬҒᚸᩐᛡˊ᧔ԩᄊ᥆ऄࠫኖ႕ڂःവरࣜࢽܸᄊԍҧǍ یःЏܹভˀᡜὊࢽܸᄊ᧚ࠫ͘͜ፒᄊТጇیႀ̆Тጇ ᄱТˊҬ᧚ফқʽӤὊᚸᩐᛡˊԁ࠲ᤉКܸ̽Ǎ ᫂Ὂܙঌᤴښ˷ܳழУᄊஃ̷வरˀல๙ဘὊᚸᩐᛡˊᄊηৌ᧚ ᩐᛡˊηৌӑࣃፃᤪຒԣὊͮ௧ᬤᅌ̉ᐏᎪష֗ऄၹᄊᮻᤴԧ࡙Ὂ ᚸᩐᛡˊᄊԧ࡙֗Ꭺፏᤰη۳ᆩஷඵࣱᄊଢᰴὊᚸڎᬤᅌੈ ʷnjᚸᩐᛡˊဘ࿄ 㭯ᔎᔺ ऄၹ ᚸᩐᛡˊᄊښBEPPQ ࣱԼ( ጇፒǍڊ҂ܱڀေፇ౧ᤅܫᄊڀં̰ॷጇፒᤅ ေὊՏܫ̔ὊཀྵՑಪ̔ᆊᄊˀՏὊᣁˀՏᄊॷጇፒᤉᛡ ጇፒᄊڊҒᎶˊҬጇፒˊҬ̔ᄊᣁὊ߲᠇᠊ଌஆᒭܱ ጇፒǍڊܱ˞Ի̿ॆکˊҬᄱТᄊጇፒὊ ጇፒ᠇᠊ᄰଌˁࠇਗ਼ᤉᛡ̔̉ὊଢΙˊҬҬὊਫ਼దˁᩐᛡڊܱ ਫ਼ᇨǍڏॷˊҬጇፒጸੇὊݠʾ ጇፒnjҒᎶˊҬጇፒ֗ڊᄬҒὊᩐᛡ͍ˊᄊˊҬ۳వᣤႀܱ Ѿၹ͉ϙǍ नԧᄱऄᄊካขࠫᤈ̏ᤉᛡଈѬౢὊଢᰴᩐᛡ͍ˊࠫԋԾᄊ ۳̆ Hadoop షᄊྲགὊԻ̿ၹ߲ߛϲᩐᛡˊᄊሏጳὊࣳ ᚸᩐᛡˊᄊऄၹښʼnj)BEPPQష х̵̓᭧˚ᄊܸ᫈ᮥǍ ऄܸࠫ̽ᄊ͖ҹӡѬ௭Ὂᡕᡕܳᄊ͍ˊ͘᧔ၹᤈመషᝍښ үগѬౢὊࣃፃ˞̵̓ࣜ̀ࢽܸᄊѾǍ᧔ၹ۳̆ Hadoop ష ᧔ၹ Hadoop షࠄဘ̀ࠃֶߛϲ֗̔ڄᄬҒὊᬁ᧗ࣅࣅᬷ ऄၹὊ߲Ի̿ࠄဘ๒᧚ᄊͰੇవߛϲnjᄊᰴᝠካ֗ѬౢǍ ᐏᎪᛡˊ֗ႃߕҬᛡˊ४҂ࣹ̀ฅᄊ̉ښHadoop షᄬҒࣃፃ ܸᄊଈऄၹ҂̀ʷ˔ழᄊᰴएǍ Ὂ࠲ی๒᧚ߛϲὊ߹ЛஃેѬ࣋रᝠካὊஃેᰴጟଈካขവ Hadoop ࣱԼ௧ࠫ͜ፒᄊᮬ֗᭩ழὊ߲Ի̿ࠄဘͰੇవᄊ ̄nj)BEPPQషᄊԧ࡙ဘ࿄ 10 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ˗ॷ ҒᎶˊҬጇፒ ̡ᛡ Ꭺᩐnjஃ̷njေ᠉ጇፒ Х̵ҒᎶጇፒ ҒᎶጇፒ ҒᎶ ႃភᩐᛡܬᒭҰ ᎪགҬ٨ ˊҬጇፒ ኄʼவ ॷˊҬጇፒ Ꭺᩐnjஃ̷njေ᠉ጇፒ Х̵ҒᎶጇፒ ҒᎶጇፒ ҒᎶ ႃភᩐᛡܬᒭҰ ᎪགҬ٨ ˊҬጇፒ ˗ॷ ҒᎶˊҬጇፒ ኄʼவ ̡ᛡ ࣱԼጇፒॷˊҬጇፒ Hadoop ਫ਼ᇨǍڏҫ Hadoop ࣱԼጇፒՑὊᩐᛡˊҬ۳వᣤݠʾܙ ေҪᑟǍܫଈ ὙܱࠫଢΙಊវҪᑟὙᤇԻ̿ಪߛϲྲགὊଢΙ͋ܬߛϲ ҫ Hadoop ࣱԼጇፒὊࠄဘॷጇፒᄊԋԾܙॷጇፒࡏὊښˀԫὙ ᩐᛡ͍ˊΎၹ Hadoop ࣱԼషᄊ۳వন௧δેԔጇፒ ေਫ਼ద̔ˊҬᄊЦʹࠄဘǍܫ᠊᠇ॷˊҬጇፒ 11 Hadoop ᑇৄ䞥㵡䫊㸠Ϯⱘᑨ⫼ᶊᵘ ਫ਼ᇨǍڏፇݠʾڱᤰ᧔ၹᄊҪᑟവ ःὊᤉᛡ᭤ፇӑᄊߛϲǍ अࡏὊѾၹ Hadoop ࣱԼጇፒὊᤉᛡܸവߛϲὊଢΙ HBase ေ֗ካขଈὊ̿Χၷੇ໘ᡜᭊරᄊՊመǍጇፒܫᤉᛡҫࢺ ὊࠫڱေҪᑟവܫေࡏὊ͘᧫ࠫˀՏᄊˊҬᭊරଢΙˀՏᄊˊҬܫᫎ ᤉᛡ࡙ᇨǍጇፒ˗ڱေՑᄊፇ౧ᤰ࡙ᇨവܫᦊጇፒᖍԩᠫູὊཀྵՑ࠲ Ὂ̰ܱڱ࡙ᇨവ֗ڱMVC ᄊവरᤉᛡᝠǍᯫЏጇፒʽࡏὊᤰଌവ Hadoop ࣱԼጇፒ˞̀໘ᡜᚸᮗ۫ҬᭊරὊጇፒЯᦊ᧔ၹ ᑟ࡙ᇨᭊ᜶Ǎ ҂ॷጇፒᄊ̲ःὊ̿Ι౽̏ઑ᛫Ҫڀေፇ౧ᤅܫጇፒὊ˷Ի̿࠲ ڊ҂ܱڀေፇ౧ὊᤰҒᎶˊҬጇፒὊᤅܫေὊࣳ࠲ܫࠫˊҬ̔ᤉᛡ Hadoop ࣱԼጇፒಪˊҬᭊරὊѾၹ̰ॷጇፒКᄊԋԾὊ ̿ࠄဘ౽̏ಊវˊҬᭊරǍ ᭊ᜶ᄊॷ҂ Hadoop ࣱԼጇፒ˗Ὂ͋ܬॷˊҬጇፒࠀ ጇፒǍڊ҂ܱڀေፇ౧ᤅܫေὊཀྵՑ࠲ܫᣁ҂ Hadoop ࣱԼጇፒ˗ᤉᛡ ጇፒᄊ౽̏ಊវˊҬڊҒᎶˊҬጇፒಪˀՏᄊˊҬ̽ᆊὊ࠲ܱ ጇፒˀԧၷˀӑǍڊܱ 12 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ਫ਼ᇨǍ ڏᬷᏆᄊവरᤉᛡᦊᎸὊᤰ᧔ၹᄊྭေᦊᎸፇݠʾܬ˟ጇፒ᧔ၹ ˀՏᄊᬷᏆ˗ᤉᛡᦊᎸὊښߛϲǍ ऄ࠲ Hadoop ࣱԼጇፒᣄ͈͋ܬ༫ ࠔڡጇፒॹᮌᝠपڂᚸᩐᛡˊࠫߛϲ߷Л᜶ර᭤ᰴὊ ေவरǍܫරὊଢΙˀՏᄊ࡙ᇨ ေՑᄊፇ౧ᤉᛡ Web ᮆ᭧࡙ᇨὊՏᤇ᜶ಪԔదጇፒᄊᭊܫࠫ ሩ⼎ഫ ߛϲኮေǍ͋ܬଢΙ HDFS ͈ጇፒὊଢΙܳҞవ ˟᜶Ҫᑟ௧ଢΙ HBase ःὊࠫ᭤ፇӑᤉᛡፒʷߛஊኮေὊ ᭄ഫ ေืሮǍܫӊեଈካขnjˊҬ˗ڱὊҪᑟऄၹവڱവ ေܫေᭊ᜶֗ጇፒᤂᛡᭊ᜶ଢΙࠫऄᄊҪᑟܫ˟᜶Ҫᑟ௧ಪˊҬ ࡳ㛑ᑨ⫼ഫ ေவขǍ ܫ˟᜶Ҫᑟ௧᧫ࠫˀՏᄊູ֗ಫरὊଢΙࠫऄᄊК ষഫ ູ௧̵̓ᄊॷˊҬǍ ᚸᩐᛡˊ˗Ὂᤈ̏ښေᄊູǍܫ˟᜶Ҫᑟ௧˞ጇፒଢΙҫࢺ ⑤᭄ഫ ᄊЦʹឭݠʾǍڱඈ˔Ҫᑟവ 13 Hadoop ᑇৄ䞥㵡䫊㸠Ϯⱘᑨ⫼ᶊᵘ ˊေࠄ̔ܫڡ᠍ՂԋԾ੪ӿಊវ Ҫᑟᄊ̔ Ὂᝨॷጇፒఞݞ 4ὌѾၹ Hadoop ࣱԼጇፒὊઞ̀ॷጇፒ౽̏๗Ᏺভ̔Γݠ ᝮैὊ˞ᄣኮࣟঅὊଢῚదҧᄊషஃǍ රὊᎄиଈካขὊѾၹ̔Ὂঌᤴࠀ͍ͯˊ᭤ขฤᨑᄊ̔ 3ὌЍѬѾၹ̀ Hadoop ࣱԼషᄊଈҪᑟǍԻ̿ಪˊҬᭊ ଢᰴ̀ࠇਗ਼ᄊ໘ਓएὊࠄဘ̀̿ࠇਗ਼˞˗ॷὊଢᰴ̀ᩐᛡᄊቤ̂ҧǍ ैὊඓሚጟଽጊፇ౧ὊԻ̿˞ၹਗ਼ࠄଢΙ͊͵̔ᫎᄊ̔Ὂ 2ὌЍѬѾၹ Hadoop ࣱԼష๒᧚ঌᤴଽጊҪᑟǍᄈʺ̣ᝮ Hadoop ࣱԼጇፒ˗Ὂࠄဘ๒᧚ߛϲǍ PB ጟᄊߛϲὊԻ̿ંᩐᛡˊҬ̗ၷᄊਫ਼దˊҬᦐߛϲ҂ 1ὌЍѬѾၹ Hadoop ࣱԼషᄊߛϲ͖ҹǍHadoop ࣱԼԻ̿ଢΙ ࠲ʽᤘவವळᤉᚸᩐᛡˊ˗Ὂ࠲ЍѬѾၹ̿ʾ͖ҹǍ nj)BEPPQషᄊ͖ҹپ 14 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ಝdٲޯރົં࠼स स࠼ིdંၽၩݮԙúúඟၩӖӤఉڕંົރಝ࠼ིٲֺ̣ޯ ೬ԅ֟ٝᆴēပտّԨၮcࠡకྜԅӖࠓࢅޏڑອ സݯᅥྜഋಶēЩҶಹӉ҅నߑcӖޙ㗙ҟ㒡˖ ༯ஜē ̽ᄊ҂ଢΙूదҧᄊషδǍ Ὂ Hadoop ࣱԼషॹ࠲ࠫᚸᩐᛡˊऄܸࠫڂऄಊវፇ౧Ǎ־ጟ ᛡᄊԋԾಊវጇፒ˗Ὂࠄဘ̀សᩐᛡਫ਼ద᠍Ղᄊ̔ԋԾᝮैඓሚ ᄬҒὊܹܸ̈НՃὊࣃፃ࠲ʽᤘᝍхவವੇҪऄၹ҂౽ᩐ ηৌጇፒᄊે፞Ϥकԧ࡙Ǎ ҬὊЍѬԧ͜ፒःᄊ͖ҹὊϢ҂͖ҹ̉ᛪὊ̰Ꮻδᚸᩐᛡˊ IT 15 Hadoop ᑇৄ䞥㵡䫊㸠Ϯⱘᑨ⫼ᶊᵘ ࣱԼᄊүͻӊեڮՏᄊǍᤈਓ֊ᅌࣱԼၹਗ਼ᬤԻᑟᤉᛡʷ̏ᆡ HDFS ʽߛదՊመὊదНၹᄊὊదࠛᄊὊˀՏᄊၹਗ਼Ի̿᫈ˀ ǍनஊԁС̚ Hadoop ᬷᏆڄˁ᭤షڄὊᤈХ˗ӊష ڄਫ਼់नஊὊࡃ௧ᑟܵनஊፌНՃ͊͵దᭊ᜶ࣱԼᄊХ̵ᐌᑟ ୄथनஊࣱԼ ̄˞ʷὊथʷ˔नஊ˅߷Лᄊ Hadoop ፒʷࣱԼǍ ܸߛϲˁషᄊஃેὊੈ̓ᭊ᜶࠲ː˔ Hadoop ᬷᏆՌڄࠫХ̵ ஃˊҬՌࣳὊ̿ԣڡՊᒭવదʷ˔උᣗܸᄊ Hadoop ᬷᏆǍ˞̀ఞݞ டՌ˨ҒὊːښ̓ࠫː˔ᮠᎪባᄊᤉᛡடՌnjˊҬᤉᛡடՌǍ ࠈ࣋ՌࣳὊᤈ˔ழᡜ̿ᝨˊႍෳᒑǍᬤ˨Ꮻᄊᭊ᜶ੈ៰ژெὊ͖ᦺ ൦ळК Hadoop షǍ2012 ࣲ 3 థ 12ᤪښ˷ڄHadoopὊНՃᄊХ̵ ΎၹښὊᡕᡕܳᄊᮗ۫ᦐ־ᅌ Hadoop షᄊࣹ̿ԣܸᄊॖ Ύၹ HadoopǍːࣲᫎᬤښڄࣲ˨ҒὊԶద͖ᦺᎪ 2011 ښொ ᑀఀ ٙᵄ ˨ BEPPQ ࣱԼनஊ( ៰ژᦺ͖ ˔ܱὊᬤᅌनஊᄊၹਗ਼ᡕᡕܳὊ˞̀வΧՑ፞ᄊੱ࡙Ὂੈ̓࠲Ѭᦡܳ ᤂᖹኮေʽੈ̓Զᭊ᜶Тฌї˔Т᪄ᄊᓬགǍښ˗ᄊ͊͵ʷ˔ᓬགὊᤈನ ᄊ୲ͻԶᑟᤰࣱԼଢΙᄊࠇਗ਼ቫ٨ੰᛡὊၹਗ਼ˀᑟᄰଌ᫈ᬷᏆ ҬǍၹਗ਼ࠫᬷᏆܭJobTracker ੋᏨ NameNode ѣဘᬪԻ̿ঌᤴূ ၹᄊᝈᓤὊेܬNameNode ᦊᎸὊܱὊSecondNameNode ᤇЍे̀ NameNode ˁ JobTracker ѬनᦊᎸὊSecondNameNode ˷࿘ቡ̆ Ǎڏʷ˔උᣗ߷Лᄊ Hadoop ᬷᏆᄊ੨ፇ˞ڏᭊ᜶ୄथ HadoopǍʾ ڡѺଌᝏ Hadoop Ὂˀኮ௧ߦ˸ᤇ௧ၷ̗ऄၹὊˀԻᥘВښܸࠒ ੨ ভǍܥਓԣ᭤ਓǍਫ਼̿ੈ̓ᯫЏ᜶δࣱԼᄊ߷ЛভԣϤ 17 Ӭ䝋ೳ䈚 Hadoop ᑇৄᓔᬒП䏃 ʷᓊੈ̓͘नԧʷ̏ऄၹᤌଌ Hadoop ᄊ HDFS ҬὊඋݠெঃ᧔ 䴲⊩ᑨ⫼ⱘ䖲 ឭ௧ౝएˀ߷ЛᄊǍ ၹਗ਼ࡃԻ̿ᤰᤈ˔ጼቫ୲ͻ HDFS ᄊਫ਼దᠫູὊᤈࠫनஊࣱԼ ጼቫᤌଌ҂ੈ̓ᄊᬷᏆὊࣳ˅សၹਗ਼વదᤈ˔ጼቫᄊ root ᠍ਗ਼Ὂᥧ˦ស ࡃԻ̿᫈ᠫູǍϜၹਗ਼AᤰԳʷԼళᅼᄊLinuxڧڡԶ᜶ᅼ᥋Ҭ Hadoop ᬷᏆᳮᝣࣳࠫᤌଌХҬᄊ Linux ጼቫϢᢶ͋ᝣὊਫ਼̿ Linux 㒜ッⱘ䱣ᛣ䖲 ͋ᝣ᫈ᮥnjၹਗ਼ిᬍ᫈ᮥnjWeb ႍ᭧᫈҄᫈ᮥǍ ʷ̏߷Л᫈ᮥǍඋݠᢶښˀ͘Ꮶᘽܺܳ߷Л᫈ᮥǍͮ௧ Hadoop ᆸࠄߛ ொరࣳښ߲ᄊߛϲᑟҧ̿ԣѬ࣋रᝠካᑟҧὊᒰ࠶˞ڂК Hadoop ᦐ௧ ळڄ͊͵ጇፒ᜶ࠄဘनஊᦐ᜶Џᝍх߷Л᫈ᮥǍੈ̓ᅼ᥋ܸᦊѬ ߷Л ࠶ॢܳՑర፥ઐੇవǍ ࠇਗ਼ʽଢ̔᭤ᄱТᄊሮऀǍҒరᔵݞԻ̿ѓښᤉᛡᄣὊൣၹਗ਼ ࠄᛡᦡՌኮေǍेཀྵᤇᭊ᜶ࠫ٨ᠫູڄΙតΎၹ̿ԣࠫඈ˔ၹਗ਼ ᄬै̏דᄬैΙၷ̗Ύၹnj̏דࠇਗ਼ቫᄊѺరࡃᭊ᜶҄ࠀˑಫᄊᔵὊ थښਅˀՏὊॢԻᑟѣဘ౽˔ၹਗ਼ᄊ୲ͻΎ४ட˔ࠇਗ਼ቫǍਫ਼̿ СՏΎၹᄊৱцǍႀ̆ඈ˔̡ᄊ˸ڄ˔Ὂᤈ͘ѣဘ౽˔ࠇਗ਼దܳ ΎၹᄊৱцѬᦡᄱऄᄊࠇਗ਼ڄ˔˞̀ЍѬѾၹᠫູὊੈ̓ોིՊ ᄊͻၹǍ ၹܬѬᦡ҂ˀՏᄊࠇਗ਼ቫ٨ʽὊՏ˷ᡑ҂ڄࠇਗ਼ቫ٨Ὂ࠲ˀՏᄊ 18 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 Իᑟᒱ task ᄊܳܿ᠌తጼ࠲ᒱͻˊᄊܿ᠌Ǎ ᦡᎶ̿ԣੱ࡙ӊᦡᎶˁᬷᏆ˗ᄊХ̵ᓬགˀʷᒱὊॢܒ̏ slave ᓬགဗ Ꮆ̀౽˔ task ࠽ត 4 ݠ౧ˀੇҪѷ࠲ᜂ˞ܿ᠌Ὂݠ౧ழҫᄊᤈ MapReduce ˗ੈ̓ ښʾὊᥧ˦࠲ᒱᤈ˔͈ᄊˏܿǍܱὊ ᤈழҫᄊ slave ᓬགʽὊ˨Ցݠ౧᭤ขၹਗ਼ਓ࠲ᤈ̏ᓬགՏښߛϲ ᦐ͋ܬ˔ၹਗ਼ҫї˔ళᅼᄊ slave ᓬག҂ᬷᏆ˗Ὂ্ݞ౽˔͈ᄊʼ ߛϲὊᬤՑᜂХ̵͋ܬ˔̿ԣͻˊᄊܿ᠌ǍϜੈ̓ඈʷ͈͋ᦐ௧ʼ ҫ҂ master ˗ᄊὊᤈನʷ̏ˀԻ᭥ᄊ slave ҫదԻᑟᒱᄊˏܿ ҒᄊᬷᏆ௧Ի̿ᬤਓ˨ښmaster ᓬགᆸࠀ̿ՑὊslave ᓬག ښᄊὊਫ਼̿ Hadoop ᄊː˔˟᜶ᦊѬ HDFS ֗ MapReduce ᦐ௧ master/slave ፇ slave 㡖⚍䱣ᛣ⏏ࡴ ᤈನᒱͥካᄊፇ౧ˀюᆸǍ ࣱԼϢੇవͥካઑ᛫͘࠲ A ๗Ᏺᄊᝠካᠫູᦐካ҂ B ၹਗ਼ᢶʽὊښ̄ ၹਗ਼дЍ B ၹਗ਼ଢ̔ͻˊὊ᫈వ A ၹਗ਼ࣳదిᬍ᫈ᄊὊХ రమ͢ᤵᄊᢶ͋ὊࡃԻ̿дЍសᢶ͋ᤉᛡͻˊଢ̔Ǎᤈ࠲ᒱХʷ A ଢ̔ MapReduce ͻˊὊԶ᜶࠲ user.name ᄊ࡛ভᎶੇͿ̓ੈښ ܙ⫼᠋䑿ӑⱘݦ ᄊˊҬઑ᛫Ǎ ־ᄊᝠካᠫູὊᒱெᄊˊҬᝠካ४ˀ҂ᡜܵᄊᝠካᠫູ̰Ꮻॖ ˀ߷ЛᄊǍՏᤇԻ̿नԧʷ̏ሓదᄊऄၹሮऀၹए๗ᏲࣱԼ ࡃԻ̿ड़ HDFS ʽߛϲnjξஈὊᤈನࠫဘదᄊ௧ౝХڧڡ ᄊࣱԼ˗ࣳࠫኄʼவऄၹϢᢶ͋ᝣὊ͊͵ APP Զ᜶ᅼ᥋ХҬ Ғ˨ښᬷጇፒ࠲ܱᦊᄊˊҬጇፒ᧔ᬷᄊெঃᄰଌʽ͜҂ HDFS ʽǍ 19 Ӭ䝋ೳ䈚 Hadoop ᑇৄᓔᬒП䏃 ˔ᄊᝠ˷ᥖ̰ Linux ͈Ǎඈ˔͈વద឴njиnjੰᛡʼመ୲ͻὊඈ HDFS ͈ጇፒᝠവલ̀ Linux ͈ጇፒὊਫ਼̿ HDFS ͈࡛ভ ⫼᠋㒘ֵᙃࠊ Kerberos ᄊԻ᭥ভǍ ̀ूܙᝣηৌὊᤈನܸܸ ᄊ᫈ᮥǍੈ̓ᦡᎶ̀˟̰ KDC ҬὊՏѾၹᑮవࠄՏ൦˟̰ःᄊ ˔ӭགᬪ᫈ᮥǍळКʷ˔ழᄊጇፒὊܳ̀ʷ˔ဗᓬॹཀྵࣜ͘ʷ̏ழ ʷښၹਗ਼ឰරҬᄊϋᭊ᜶Џ̰ KDC Ѭԧ TicketὊᤈನ KDC ߛ ᑮవԻ̿ʷ᪄ᝍхǍ Kerberos ʽѬᦡᄱऄᄊᢶ͋ηৌǍੈ̓ᤰᒭүӑ ښᓬགᭊ᜶ܙਗ਼njழ ၹܙӊեᢶ͋ᝣηৌὊKDC ˞ࠛᨅѬԧҬ٨ǍळК Kerberos ՑὊழ Ὂӊ Identity Store ֗ KDC ːᦊѬǍХ˗ Identity Store ˟᜶ڏ˟ʹፇ ਫ਼ᇨ˞ Kerberos ᄊڏᝣनթὊʽᤘଡᤘᄊᄱТ᫈ᮥᦐᑟ४҂ᝍхǍʾ Ցᄊྠవ˗ஃે̀ KerberosὊੈ̓࠲ Kerberos ߷Л̿ 1.0 ښ Hadoop ᓩܹ Kerberos 20 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ̆ʽ᭧Ѭౢᄊፇ౧Ὂੈ̓࠲ Hadoop ឴ԩၹਗ਼ጸηৌᄊଌᤉᛡ 4ὌኮေౝХˀவΧǍ ᄊిᬍᠫູᦡᎶᭊරข४҂໘ᡜǍాܭ3Ὄ ᬲǍڈ2Ὄࠇਗ਼ቫ٨͕ܳὊ࠱Тጇᄊʷᒱভඋᣗ ᬷᏆᄊ߷ЛǍڮ1Ὄ͊͵ࠇਗ਼ቫᄊˀ߷Лᦐ͘ᆡ ܫ˨˔ˀᡜ ʾї̿ښ፬ʽѬౢὊੈ̓ԧဘ Hadoop ᳮᝣ Linux ࠇਗ਼ቫᄊጸηৌߛ ৌఞஈඋᣗᮠጓᏫ˅ԡඋᣗᤕѭὊᤈನࡃ͘˞ᬲኮေրǍ ୲ͻኮေʽౝ˞ˀவΧὊదϋʷ̏ၹਗ਼ᄊిᬍηښৌЛᦊఞஈǍᤈನ ᤈ̏٨ʽ࠲ࠇਗ਼ቫᄊጸηښવదᤈ˔ၹਗ਼ᄊᄆैηৌὊཀྵՑѬѿᭊ᜶ їԼࠇਗ਼ቫ٨דᔪਇࠫ౽˔ၹਗ਼ᄊጸηৌᤉᛡఞஈὊᭊ᜶Џᅼ᥋ ˀʷᒱǍ ࠇਗ਼ʽ፥ઐʷ̏ᄱՏᄊၹਗ਼ጸηৌὊᥧ˦͘ᒱၹਗ਼ጸηৌᄊ˔ ܳښᜂᤵϜǍܱὊᤌଌ҂ࣱԼᄊࠇਗ਼ቫ٨᧚ᣗܳὊݠ౧ᭊ᜶ ܱὊݠ౧ጸηৌΚᠻ̆ࠇਗ਼ቫ٨ᄊភ˷ॢࠔΎၹਗ਼ᄊጸηৌ ˦ሮऀࡃ͘ઑᩲǍ ˷ࡃ௧ឭὊݠ౧ࠇਗ਼ቫၹਗ਼ᄊጸηৌˁ HDFS ʽᄊిᬍηৌˀӜᦡὊᥧ ిᬍѼலᄊϋᤰូၹʷ˔ଌᖍԩ Linux ࠇਗ਼ቫᄊၹਗ਼ጸηৌὊ ᤉᛡၹਗ਼ښᄊిᬍǍᏫ Hadoop వᢶࣳదߛϲၹਗ਼ᄊిᬍηৌὊᏫ௧ ੈ̓ԧဘὊHadoop ᔪ᜶͈ࠫࠄဘ༧ำᄊిᬍ҄ᭊ᜶Ꮆጸၹਗ਼ ᦡԂˀܵ༧ำǍ ԁʷСવద 9 መˀՏᄊᦡᎶኖ႕ǍᤈನᄊᝠඋᣗእӭὊͮ௧ХిᬍѬ ֗ጸၹਗ਼ᄊిᬍǍᤈನʼመˀՏᄊᢶ͋ࠫʼ˔ˀՏᄊ୲ͻᤉᛡଆѵጸՌὊ ͈Զᑟॆ࡛̆ʷ˔ਫ਼దᏨὊॆ࡛̆ʷ˔ጸὊඈ˔͈ᦐࠀ˧̀ਫ਼దᏨ 21 Ӭ䝋ೳ䈚 Hadoop ᑇৄᓔᬒП䏃 read write HadoopDPM MySQL ਗ਼Զᑟ᫈ХదిᬍᄊὊᤈನࡃԻ̿ࠄဘࠫ Web UI ᄊ᫈҄Ǎ ᫈ᮌᄆैǍܱὊၹਗ਼ᄊ token ˁၹਗ਼ᄊᢶ͋˷௧ፅࠀᄊὊၹ Ǎၹਗ਼ᄆैᝣᤰѷ͘ၷੇʷ˔᫈ token ࣳߛϲ҂ cookie ˗Ὂʾ ฌвႀၹਗ਼ᒭᛡࠀὊኮေրᤉᛡࠆښਗ਼Ր֗ࠛᆊὊសၹਗ਼Ր֗ࠛᆊ ኄʷ᫈ Web UI ႍ᭧͘ᒭүᣁ҂ʷ˔ᄆैႍ᭧Ὂ᜶රၹਗ਼ᣥКၹ ᧫ࠫᤈ˔᫈ᮥὊੈ̓ξஈ̀ Hadoop ᳮᝣᄊ Serverlet ٨Ὂၹਗ਼ ࣜ͘ʷ̏Х̵߷Л᫈ᮥǍ ःᦡᎶᄊ᠍Ղࠛᆊҫᣒ҂ᦡᎶ͈˗Ὂᤈನᤰ Web ႍ᭧ఒ᭛ѣὊ ਗ਼Ի̿ᄺ҂ਫ਼దͻˊᤂᛡηৌ̿ԣᄱТᦡᎶηৌὊదੈ̓͘࠲ʷ̏ HDFS ʽᄊὊᤈࠫʷ̏ሓదᄊ߷Лভ௧దδᄊǍܱὊၹ ηৌǍᳮᝣᤈː˔ႍ᭧ࣳళࠫၹਗ਼ᢶ͋ᤉᛡᝣὊ͊͵ၹਗ਼ᦐԻ̿᫈ Hadoop ᳮᝣଢΙː˔ Web ႍ᭧ 50030 ֗ 50070ၹΙၹਗ਼ಊវ Web UI 䆓䯂ࠊ Զᭊ᜶ᤰ DPM ࢺЦࡃԻ̿ᣐ૿ిǍ MySQL ˗ᄊၹਗ਼ጸηৌὊᤈನݠ౧ၹਗ਼ᭊ᜶᫈ʷ˔ழᄊᠫູὊኮေր வᤉᛡኮေǍԳܱὊੈ̓नԧ̀ʷ˔ DPM ࢺЦὊၹ̆ξஈڡ˔ʷښ˗ ௧ᤰ MySQL ःᖍԩၹਗ਼ᄊጸηৌὊᤈನਫ਼దၹਗ਼ᄊጸηৌᦐᬷ hadoop.security.group.mapping ᦡᎶ˞ᒭࣂࠄဘᄊʷ˔ዝὊΎ४ Hadoop ਫ਼ᇨὊ࠲ڏ᧘иὊ࠲ηৌેˤӑ҂ʷ˔࿘ቡᄊТጇः˗Ǎݠʾ 22 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ඈ˔ၹਗ਼ᄊΎၹ˸ਅˀʷನὊඈ˔ၹਗ਼ᄊ୲ͻඵࣱˀʷನὊݠ౧ 㾘㣗 ॢ࠶ᄊʷᦊѬὊᤇᭊ᜶̰ॢܳவ᭧ᤉᛡᔵ֗ᄣǍ ᑟनஊᏫ௧᜶नஊݞǍݠ͵੦ᑟᤂᖹݞʷ˔ Hadoop ࣱԼὊషड़ड़Ӵ ʷ˔Իၹᄊ࿄গǍेཀྵੈ̓రమᄊፇ౧ˀ̩̩௧̆ܫ௧ᡑ൦Ὂ̽᛫ࣱԼ ᝍх̀߷Л᫈ᮥԶᑟឭ Hadoop ࣱԼनஊѣԝ̀Ὂ߹ੇᤈ˔Զ ࣱԼᤂᖹ ኮေր ၹਗ਼ ᢶ͋ᝣ፥ઐᢶ͋ηৌ ࠇਗ਼ቫ٨ ᄆै ి ηৌ ፥ઐၹਗ਼ ឴ԩၹਗ਼ጸ Hadoop ၹਗ਼ηৌ Kerberos Web UI DPM Kerberos ˗ǍDPM ᤇଢΙၹਗ਼ฌвnjኮေրࠆnjઑҪᑟǍ ᛡిᬍѼலὊኮေրξஈၹਗ਼ηৌ 5 Ѭ᧿ՑࡃԻ̿ၷ҂ Hadoop ጇፒ ः˗ǍHadoop ᤰ MySQL ः˗ᄊၹਗ਼ጸηৌࠫၹਗ਼ᄊឰරᤉ MySQL ښKerberosnjၹਗ਼ηৌnj Linux ࠇਗ਼ቫὊХ˗ၹਗ਼ηৌߛϲ ὊХ˗ኮေրᤰ DPM ࢺЦԻ̿୲ͻڏਫ਼ᇨ˞டʹ߷Лڏʾ ᅝܼᶊᵘ 23 Ӭ䝋ೳ䈚 Hadoop ᑇৄᓔᬒП䏃 ፇ౧ૉᬷᏆԠូ͖Ǎ Ꮸ࠲૯ܿత࠵ӑǍᄣࠄᬅʽ˷௧ࠫဘదҬᄊѬౢὊԻ̿ᤰᄣᄊ ԧၷҒԧဘҒЎὊ᧔ԩଐஷᥘВᤵੇ૯ܿὊੋ̃ښਫ਼់ᄣࡃ௧ ⲥ ᤂᖹৱцǍ ౽መሮएʽԻ̿ԦࣱԼᄊښ፞ழืሮᄊᮋѾੰᛡǍืሮࠄஷᄊৱц Ὂᤈನ੦ᑟδՑેڲંଧͱТ᪄ᄊགࡃݞǍืሮʷேᛡὊʷࠀ᜶ ˁၹਗ਼ᄊˀॹ᜶ᤰǍืሮᄊ҄ࠀᭊ᜶ંଧՌᤠᄊएὊˀᑟܳདဵὊ ᫈ᄊ߷ЛǍࣱԼԧ࣋ழྲভᭊ᜶ᡌӤጟืሮǍืሮѓ࠶̀ኮေ̡ր ʷᤉКࣱԼࡃЩੇᓢݞᄊ˸ਅǍၹਗ਼ႂឰిᬍᭊ᜶ืሮὊδښ ၹਗ਼ฌвழ᠍ਗ਼ᭊ᜶ืሮὊᤈನ͘ΨΎၹਗ਼ЏࣱྀԼᔵὊՏ ⌕ ࠄ˗ˀல߹ؓǍښ҄ࠀˀරʷ҂ͯὊԻ̿ ՐǍᔵδ̀ᬤᅌᫎᄊረὊࣱԼΚைϸѸᦊᎸʷನǍᔵᄊ ᄊˊҬᏫࠀǍेཀྵੈ̓ᤇᎶ֑̀ՐᔵὊਫ਼దᄊᄬैᦐၹ࠵иߚඇ֑ ᄬैὊݠ /tmpnj/commonnj/tmpnj/warehouse ὊЦʹ᜶ಪඈ˔НՃ user ʾǍेཀྵᤇదХ̵ ښˊҬηৌὊጳʽᄊˊҬηৌˀЊஊᎶڄϲ ᄬैὊᦡᮩඋᣗܸὊၹ̆ߛڄ˞ᮩὊΙၹਗ਼ߛϲ˔̡Ὑ work ᄬै थ̀ʷ˔ user ֗ work ᄬैὊХ˗ user ˞ၹਗ਼ᄊሓ̡ቇᫎὊᎶʷࠀᦡ ಪᄬैѹښඈʷጟᄊᄬै̿ԣᄱऄᄊˊҬե˧Ὂ̰ၹਗ਼ᝈएᄺὊੈ̓ ҄ࠀߛϲᔵ֗ᝠካᔵǍHDFS ௧ʷ˔͈ጇፒὊ̰ಪᄬैनݽ҄ࠀ ေրᄊ፥ઐੇవǍHadoop ˟᜶ଢῚː˔ҬߛϲˁᝠካὊੈ̓ᭊ᜶ ҫၹਗ਼˨ᫎᄊᤰੇవὊ͘ଢᰴኮܙదፒʷᄊᔵ͘ΎࣱԼˀݞၹὊ͘ 24 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ࣱԼनஊᒰ̭ܸܸ࠵࠵ᄊ᫈ᮥ˷᥅҂ˀ࠶Ὂඈʷ᫈ᮥᄊѣဘᦐ͘ ፇ ͻˊࠫᬷᏆԠᤉᛡូடǍ ๗ᏲᄊᝠካᠫູὊಪԋԾڄ˔ѬౢԋԾͻˊᤂᛡৱцὊካඈ ग़Ϯߚᵤ ReduceSlot Ǎ ReduceSlot ᄊ๗ᏲඋὊ̿ូடѵᄊԠὊ̿ԣ MapSlot ˁ ᄣѵᄊࠄូएηৌὊඈ˔ͻˊᄊॠηৌὊ̿ԣ Mapslot ֗ 䯳߫ⲥ ࠫၹਗ਼ᄊ HDFS ᦡᮩᤉᛡᄣὊଢᧈၹਗ਼ԣຍေరǍ 䜡乱ⲥټ⫼᠋ᄬ Ցї˫దˀྀᄊपηৌǍ ࠫ NameNodenjDataNode ᄊெঃ˗ᄊपࠀరᤉᛡѬౢὊѬౢ҂త ᓖᐌߚᵤ ᮩᬍ҄ὊԣઑǍ ᭊ᜶ᄣඈ˔ᓬགᄊඈ˔ᇓᄨᄊЦʹΎၹৱцὊࣳࠫᇓᄨΎၹϢᦡ ⺕ⲬՓ⫼ⲥ ᄊೝಊǍ ᄊᓬགళԣԧဘὊᤵੇ૯ܿǍྲѿ௧ SecondNameNode ߛำ࿄ц ᎄиᑮవᄣՊᓬག௧աߛำὊ̿ԣᄱऄᄊઑଐஷὊ̿Вܸ᧚ ᒋⲥع⚍㡖 25 Ӭ䝋ೳ䈚 Hadoop ᑇৄᓔᬒП䏃 ࣱԼ࣎Ὂழ๎ॲӰcloudfireǍ៰ژ㗙ҟ㒡˖ϬౕὊ͖ᦺ 1 ˔̡ெᤉᛡெ፥ઐԁԻǍ ʷСᤂᛡᄊͻˊᡔ 200 ʺǍᄬҒࣱԼᤂᖹඋᣗሷࠀὊඈևԶᭊઆК ၹὊᡔ 100 ܳ˔ฌвၹਗ਼Ὂඈܹଢ̔ͻˊᡔ 7000ὊࣱԼᤂᖹᒰ̭ Ύښڄ˔ܳ ࣃፃద 20ڄΎၹࣱԼὊᄬҒЛᬷښڄ˔ གǍ̰తѺᄊ 1 ᄣܙΨΎੈ̓নᏦ௧աᭊ᜶ξஈืሮnj௧աᭊ᜶ҫᔵnjᭊ᜶᜶ழ 26 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 heartbeat …… heartbeat HAagent SlaveNameServer NameServer sdb sdc dsp1 dsp2 dsp3 DataServer sda dsp2 dsp3 sdb sdc dsp1 DataServer sda blockid,fileid heartbeatmessage Data controlmessage DataServer id blockid/ Application/Client ਫ਼ᇨǍڏጸੇὊݠʾ TFS ᬷᏆႀՐߚҬ٨NameServer֗Ҭ٨DataServer እ̮ http://code.taobao.org/p/tfs/src/ὊଢΙፌܱᦊၹਗ਼ΎၹǍ TaoCode ʽनູᮊᄬ˟ᮆ ښతܸᬷᏆߛϲ͈ࣃᤃӢ̣ǍTFS ࣃ ࠃᄊՊᮊˊҬ˗ὊᄬҒࣃᦊᎸᄊښऄၹڡߛϲҬǍTFS ᜂࣹฅ ࣋र͈ጇፒὊ۳̆ᤰᄊ Linux Ҭ٨थὊ˟᜶ଢΙ๒᧚᭤ፇӑ TFSTaobao File System௧ʷ˔ᰴԻၹnjᰴভᑟnjᰴԻੱ࡙ᄊѬ ᓴটϰ ࠃ๒᧚͈ߛϲࠄ Яߛ˗ὊࣳˀᤉᛡેˤӑߛښNameServer ʽᄊਫ਼దЋηৌᦐδߛ ߛϲ҄ ὊΎ४ block ҞవˀͰ̆ᬷᏆᦡᎶϙὊδጇፒߛϲᄊԻ᭥ভǍ҄ ܭѷᝣ˞ DataServer ࣃፃὊ͘࠲ស DataServer ʽߛϲᄊ block ᤉᛡ DataServerὊे NameServer ᡔʷࠀᫎదஆ҂ DataServer ᄊηৌὊ NameServer ԧॷηৌὊ NameServer ѷಪॷኮေਫ਼దᄊ ፌڡթүՑὊ͘Ք NameServer ලઑХߛϲᄊਫ਼ద block ηৌὊࣳևరভ ᇓᄨὊ̿ΧЍѬѾၹᇓᄨ IO ᠫູǍDataServerڱᤉሮὊඈ˔ᤉሮኮေʷ ʷԼ٨ʽᦊᎸܳ˔ DataServerښDataServer Ҭ٨ᦊᎸᤰ͘ ҬᄊᰴԻၹভǍ ၹҬ٨ࡃѭ૱˞˟Ҭ٨ଌኮҬὊ̿δܬၹҬ٨ʽὊܬ҂ NameServer ᄊ࿄গὊेХೝ҂˟Ҭ٨Ὂ HA agent ࠲ vip ѭ૱ ܬ˟ၹҬ٨Ὂ HA agent ᠇᠊ᄣܬࣳ࠲ block ξஈηৌՏ൦ᒰ Ҭ٨С̚ʷ˔ vipǍৱцʾὊ˟ NameServer ેద vip ଢΙᄊҬὊ NameServer ҬᦊᎸ᧔ၹ HA ᥘВӭགᬪὊːԼ NameServer ᄊ͈Ǎ ߛϲ͈ѬᦡὊblock id ֗ file id ጸੇᄊ̄Ћጸʷಖគʷ˔ᬷᏆ᧗ښ ˗ᄊ͈વదʷ˔ block Яʷᄊ͈ᎄՂfile idὊfile id ႀ DataServer ᎄՂblock idὊblock id ႀ NameServer ѹथѬᦡὊblockڱʷᄊ ᬷᏆ˗વదЛࡍښ ˀՏᄊʽὊ̿δᄊᰴԻ᭥ভǍඈ˔ block block ˗ࠀ͈ͯǍඈ˔ block ͘ߛϲܳ˔Ҟవ҂ ښथቡጊळὊ̿Χঌᤴ Տʷ˔ block ˗Ὂࣳ˞ blockښ64MBԻᦡᎶὊTFS ͘࠲ܳ˔࠵͈ߛϲ block˞ӭͯߛϲ֗ጸጻὊblock ܸ࠵ᤰ˞ڱTFS ̿ 137 ᅲ䏉ټᅱ⍋䞣᭛ӊᄬ⎬ ӱҿѕ߆ԇ֘݇ރ ݐଟޞstep4:գ ўކstep1:બࡌӊ ॆद,ଏ֛ ӊҵ݇ރ ۨ؏ӊݱҁ ࣍ߎेઍ blockޏ step6: ݕсӊબࡌ ӊ؏ࡂ݇ރ:step5 ঀӊৈߧۯؚ step7:ଏ֛ ॆद,ଏ֛ ӊҵ݇ރ ӱҿѕ߆ԇ֘݇ރ ݐଟޞstep4:գ DataServer(Replica2)DataServer(Replica3) step3:էmasterՇଟӊબࡌ ێङblockͫଏ֛blockѹҒ step2:ଣӟ▲ЗՕӊ Client DataServer(Master) NameServer block ЯᦊᄊϠረͯᎶ̿ԣ͈ᄊܸ࠵Ǎ ښळ˗ᝮै̀ block ˗ඈ˔͈ DataServer ʽὊඈ˔ block ࠫऄʷ˔ጊळ͈Ὂጊ ښ᫈҂ߛϲᄊ͈Ǎ idnjfile id ᎄᆊጸੇᄊ͈ՐὊ̿Ց Client ᤰស͈ՐԁԻ̰ TFS ʷ˔ႀᬷᏆՂnjblockڀ͈ੇҪи҂ܳ˔ DataServer ՑὊ͘Ք Client ᤅ ᛦ౧ࣳˀݞǍेکࠄᄊὊಪᄊηৌѬᦡ blockὊࠄХ ὊՏႀ̆᠇ᣒηৌˀ௧ాܭDataServer ᠇ᣒѬᦡᄊኖ႕Ὂࠄဘʽᣗ ᧔ၹ round robin ᄊኖ႕ѬᦡὊᤈመኖ႕እӭదǍХ̵ಪڡӭ ѬᦡԻи block Ὂእښ ਫ਼ᇨὊNameServerڏTFS и͈ืሮݠʾ ᛦǍکᛦὊረδҬ٨ᫎᄊ᠇ᣒکԧဘ DataServer ᠇ᣒˀښ blockᤰ௧ႀ DataServer ᒱᄊὙ ҄ܭԧဘ block Ꭵ࠶Ҟవ ښৌǍNameServer ద˄᫃ᄊՑԼጳሮᣃវՊ˔ block ֗ Server ᄊ࿄গὊ ࠱Тጇ᛫Ὂ˞иឰරѬᦡԻиᄊ blockὊ˞឴ឰරಊវ block ᄊͯᎶη NameServer ಪᤈ̏ηৌथ block ҂ Server ᄊ࠱Тጇ᛫Ὂಪស ϲǍDataServer թүՑὊ͘࠲Хવదᄊ block ηৌලઑፌ NameServerὊ 138 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 NameServerǍ tair ᎁߛᄊ֑˗ဋ᭤ᰴὊΎ४ፐܸᦊѬᄊ឴ឰරᦐˀᭊ᜶ፃ Ѭ࣋र key/value ߛϲጇፒὊhttp://code.taobao.org/p/tair/wiki/intro/Ὂऄၹ tair ˗tair ௧ࠃनູᄊ ښߛὊ࠲ block ҂ DataServer ᄊ࠱Тጇᎁߛ ᎁߛᄊ֑˗ဋࣳˀᰴὊ˞ TFS ᤇஃેᤊቫᎁڡదӢʺὊ̰Ꮻᒱవ ᎁߛᄊЯߛࣳˀܸὊᏫᬷᏆ˗ block ᄊ᧚ڡູඋᣗదᬍὊᑟܵၹ̆వ ࠄᬅऄၹ˗Ὂᤰࠇਗ਼ቫᑟΎၹᄊጇፒᠫښ̰తழᄊͯᎶʽ឴ԩ͈Ǎ DataServerὊࠇਗ਼ቫతጼ̰͘ NameServer ᖍԩ block తழᄊͯᎶηৌὊ DataServer ʽ᫈ᄊ͈Ǎݠ౧ cache ࣃፃܿ block ᜂረ҂Х̵ cache ֑˗ὊܸᦊѬৱцʾᦐᑟ̰ cache ᧗ᖍԩ҂᜶̰ ڡਫ਼̿ʷேవ ԧၷረᄊϋ੦͘ԧၷԫӑὊښDataServer ᄊ࠱ТጇʷᓊԶ͘ Ὂႀ̆ block ҂ڡClient ͘࠲ block ҂ DataServer ᄊ࠱Тጇᎁߛ҂వ failoverǍ˞̀ଢᰴ Client ឴ԩ͈ᄊဋࣳᬌͰ NameServer ᄊ᠇ᣒὊ ˟үᤉᛡ᠌ܿښClient ᠇᠊߹ੇ឴иѻ TFS ͈ᄊ۳వᣤὊࣳ Ǎ־ᄊҬᤵੇॖ ᫈ͰరᤉᛡὊ̿ᥘВХࠫښѻᬔ͈ӴၹᄊቇᫎὊѻᬔ͊Ҭᤰ ஆڀᄊ͈᧚ᡔʷࠀඋΓὊࠫ͘ block ᤉᛡடေcompactὊ̿ block ᧗ѻᬔὊԶ௧˞͈Ꮆʷ˔ѻᬔಖᝮǍेʷ˔ block Яᜂѻᬔ े Client ѻᬔ TFS ᧗ᄊ͈ὊҬ٨ቫࣳˀ͘ቡԁ࠲͈̰ ፌ ClientǍڀ҂͈ᄊͯᎶὊ̰ block ᄊᄱऄͯᎶ឴ԩࣳᤅ DataServer ଌஆ҂ Client ᄊ឴ឰරὊᤰಊ block ᄊጊळࡃᑟঌᤴ४ ԧ឴ឰරǍݠ౧̰౽˔Ҟవ឴ԩܿ᠌Ὂ Client ͘᧘តХ̵ᄊҞవὙ ᄊ DataServer ηৌὊࣳՔ DataServerښৌὊ̰ NameServer ʽಊវ block ਫ਼ ᄊ block ηښे Client ឴ԩ͈ὊᯫЏಪ͈Րᝍౢѣ͈ਫ਼ 139 ᅲ䏉ټᅱ⍋䞣᭛ӊᄬ⎬ ᭊ᜶ஃેʷ᫃ழឦᝓ᫈ TFS ᄊੇవ˷ԫ४᭤ͰὊԶᭊ᜶ોིөᝬ ʽጳΎၹՑὊTFS ᄊਫ਼దጸ͈ӤጟᦐᑟϢ҂ࠫၹਗ਼ᤩὊՏڱNginx വ ေਫ਼దᄊ TFS ឴иឰරὊՔၹਗ਼ଢΙ RESTful ᄊ᫈ଌǍ̽ڱᮥὊសവ github नູὊhttps://github.com/alibaba/nginx-tfsᝍхស᫈ ښࣃڱവ ˔ऄၹӤጟࠇਗ਼ቫὊӤጟੇవ᭤˨ᰴǍTFS ᤰनԧ Nginx ࠇਗ਼ቫ Ύၹᄊᄈښࠇਗ਼ቫԧ࣋ፌၹਗ਼ΎၹՑὊʷேԧဘ bugὊᭊ᜶ᤰᅼࣃፃ ࠄဘࠇਗ਼ቫ᫈ՑቫҬ٨ᄊᣤǍेڡܭ᧘ښ᫃ழឦᝓᄊஃેὊᦐ௧ ҫʷܙ TFS ˷ଢῚ Java ࠇਗ਼ቫὊඈڂҬ˟᜶Ύၹ Java ᤉᛡनԧὊ TFS ଢῚಖюᄊ C++ࠇਗ਼ቫΙनԧᏨΎၹὊՏႀ̆ࠃЯᦊˊ ੇܸ͈Ǎ ྟࠫऄᄊ TFS ͈ՐηৌὊг̰ TFS ᧗឴ѣՊ˔ѬྟᄊὊ᧘ழጸՌ ᄊ௧ܸ͈ᄊѬྟηৌὊेၹਗ਼ܸ͈᫈ὊClient ͘Џ឴ѣՊ˔Ѭ ᄊ͈Րស͈Րˁᄊ TFS ͈దᅌˀՏᄊҒ፰Ὂ̿ӝѬХߛϲ ͈ՐὊཀྵՑ࠲ܳ˔͈Րͻ˞ழᄊ͈ߛϲ҂ TFSὊ४҂ʷ˔ழ ࠵͈ᤰඈ˔ 2MBѬྟὊࣳ࠲ඈ˔Ѭྟᦐߛϲ҂ TFSὊ४҂ܳ˔ ̰ TFS ᧗឴ԩ͈Ǎܸ͈ࠫ̆ᄊߛϲὊClient ͘࠲ܸ͈ѭѬ˞ܳ˔ ᄊ͈Ὂѷ͘Џ̰ metaserver ಊវស͈Րࠫऄᄊ TFS ͈ՐὊཀྵՑ ͈Րˁ TFS ͈Րᄊ࠱Тጇߛϲ҂ metaserverὊे឴ԩᒭࠀ͈Ր ࠲Хߛϲ҂ TFS ˗Ὂ४҂ʷ˔ႀ TFS Ѭᦡᄊ͈ՐὊཀྵՑ࠲ၹਗ਼ૉࠀᄊ ͈Րᄊ࠱ТጇǍेၹਗ਼ߛϲʷ˔ૉࠀ͈Րᄊ͈Ὂ Client ᯫЏ TFS ଢΙӭ࿘ᄊЋҬ٨ metaserverኮေᒭࠀ˧͈Ր҂ TFS ଢΙழᄊҬnj࠰ᜉ Client ࠄဘǍࠫ̆ᒭࠀ˧͈ՐᄊߛϲҬὊ ఀࣳదஈԫ TFS Ҭ٨ቫᄊߛϲ҄ὊᏫ௧ᤰڤેὊஃેᤈːመˊҬ ಪˊҬᄊᭊරὊTFS ᤇࠄဘ̀ࠫᒭࠀ˧͈Րܸ͈֗ߛϲᄊஃ 140 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 ᣤᬷᏆǍ ᄊᤰᬷᏆᫎᄊՏ൦҄δ̉˞᪫ϸὊੇʷ˔ܸᄊ ՊᦊᎸʷ˔ TFS ྭေᬷᏆὊܳ˔ྭေᬷᏆ˔ܳښ༫ὊЦʹϢข௧ TFS ᬷᏆᤰܳҞవ҄δᄊԻ᭥ভὊՏஃેܳࠔ ࠔ༫ ᝨட˔ᬷᏆᄊҬဋతܸӑǍ ʷ˔ࡊ᧚ଌᤃᄊඵͯጳʽὊښᏆ᧗ਫ਼ద DataServer ᄊࠔ᧚Ύၹৱцδે ̰ࠔ᧚ᣗᰴᄊ DataServer ረ҂ழੱࠔᄊ DataServer ᧗ὊతጼΎ४ᬷ ᛦὊ࠲ᦊѬکܸǍ᧫ࠫᤈመৱцὊ NameServer ࠫ͘ட˔ᬷᏆᤉᛡ᠇ᣒ ᠇ᣒʽࣀᡰ˷ॢڂࠔ᧚ΎၹʽˁᬷᏆ᧗Х̵ᄊ DataServer ࣀᡰॢܸὊ ښХߛϲᄊ᧚۳వնඋТጇǍेழ DataServer ҫКᬷᏆὊХ ͈བྷགဘ៶Ὂਫ਼̿ DataServer ᄊ᠇ᣒˁښ᫈ឰර᭤ᬤὊ۳వˀߛ ҂ TFS ʽᄊູ͈ڀႀ̆ TFS Ғቫదࠃ CDN ᎁߛὊతጼ DataServer ࡃԻ̿नݽଢΙ឴иҬ̀Ǎ ழᄊ DataServer ʽѹथʷ block ၹ̆ଢΙи୲ͻὊழੱࠔᄊښ͘ DataServer ҬԁԻǍे NameServer ਖᅼ҂ழᄊ DataServer ҫКᬷᏆὊ Ὂթүܒݞੱࠔᄊழ٨ὊᦊᎸ DataServer ᄊᤂᛡဗܬ̡րԶᭊ᜶ю ᒰТ᧘᜶ǍTFS ࠫᬷᏆᄊੱࠔஃે᭤ԤݞὊेᬷᏆᭊ᜶ੱࠔὊᤂ፥ ࠫ̆ߛϲጇፒᏫᝓὊᬔ̀δᄊԻ᭥ߛϲܱὊஃેࠔ᧚ੱ࡙˷ ࣱੱࠔ Ք Nginx ̽ေԧ HTTP ឰරԁԻǍ 141 ᅲ䏉ټᅱ⍋䞣᭛ӊᄬ⎬ ᬷᏆὊស͈ᤇద̰˟ᬷᏆՏ൦Ὂܬˀ҂͈សྭေᬷᏆԻᑟ௧ ࠇਗ਼ቫ឴ԩ͈Ὂ͘ᤥહሏᒭࣂతᤃᄊྭေᬷᏆᤉᛡ឴ԩὊݠ౧឴ԩ ࠫ̆ஃેܳࠔ༫ᄊᬷᏆὊTFS ࠇਗ਼ቫଢῚ failover ᄊஃેὊ ᬷᏆՏ൦ὊᏫ᧫ࠫϦՂ block ᄊи୲ͻႀ 2 ՂᬷᏆՔ 1 ՂᬷᏆՏ൦Ǎ ԶѬᦡϦՂᄊ block idὙ᧫ࠫ݉Ղ block ᄊи୲ͻႀ 1 ՂᬷᏆՔ 2 Ղ и͈ښи͈ԶѬᦡ݉Ղᄊ block idὊᏫ 2 Ղ˟ᬷᏆښՂ˟ᬷᏆ ᬷᏆોིૉࠀᄊѷѬᦡ block id ၹ̆и୲ͻǍ̿ː˔˟ᬷᏆ˞ΓὊ 1 ေᬷᏆǍ˞̀ᥘВܳ˔˟ᬷᏆՏиʷ˔ block ᤵੇᄊифቊὊඈ˔˟ ေᬷᏆ௧ࠫᄊὊՏܱࠫଢΙ឴иҬὊࣳ࠲и୲ͻՏ൦҂Х̵ᄊྭ Ꮖదи୲ͻὊѷᭊ᜶᧔ၹܳ˔˟ᬷᏆᄊᦊᎸவरὊԁᣤᬷᏆ᧗ඈ˔ྭ ᄊ TFS ᬷښᄊࠔ༫Ὂݠ౧ܳ˔ᄊऄၹᦐࠫਫ਼ڡࠫ̆प NameServer sdb sdc dsp1 dsp2 dsp3 DataServer sda dsp2 dsp3 sdb sdc dsp1 DataServer sda NameServer sdb sdc dsp1 dsp2 dsp3 DataServer sda dsp2 dsp3 sdb sdc dsp1 DataServer sda ેʷᒱᄊ࿄গǍ ᬷᏆʽߛϲᄊ͈δܬ˟ᬷᏆʽὊδܬெঃ᧗ᄊ͈୲ͻऄၹ҂ иnjѻᬔ Ὂࣳႀ DataServer ᄊՑԼጳሮ᧘ஊெঃὊ࠲ی͈୲ͻዝ ऄᄊ DataServer ᝮैՏ൦ெঃὊெঃӊե͈ᄊ block id ֗ file id ̿ԣ ᬷᏆԶଢΙ឴ҬὊ˟ᬷᏆʽߛϲᄊਫ਼ద͈ᦐ͘ႀࠫܬΙ឴иҬὊ ਫ਼ᇨὊ˟ᬷᏆՏଢڏὊݠʾܬܳ˟ᄊᣤᬷᏆᦊᎸவर˞ʷیЧ 142 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵 үᤉᛡੱࠔǍ ٨ᬪὊᒭү࠲ХʾጳὙेԧဘᬷᏆᄊࠔ᧚ΎၹᡔጳὊ˟ ᄣҬ٨ᄊҬ࿄গnjᄣᬷᏆᄊࠔ᧚ΎၹৱцὊेԧဘదᇓᄨੋ ਫ਼దᄊ TFS ٨ʽᦊᎸᄣሮऀὊښ˞̀ࡊொԧဘ᫈ᮥὊᤂ፥̡ր͘ Ꮖʽ᫈Ǎ ѣဘ᧘ܸ᫈ᮥὊԻ̿ξஈ MySQL ᄊᦡᎶὊ࠲ऄၹѭ૱҂ᄊᬷ ፌ ClientǍඋݠे౽˔ᬷᏆڀrcserver ࠲᧫ࠫសऄၹᄊతழᦡᎶηৌࣜ keepaliveὊClient ࠲ऄၹ឴и͈ᄊፒᝠηৌලઑፌ rcserverὊ ڡভ ηৌὊಪᦡᎶηৌ᫈ TFS ᄊҬὙClient ˁ rcserver ᫎ͘ևర TFS ࠇਗ਼ቫթүὊ͘ಪ appkey ̰ rcserver ʽᖍԩऄၹᄊਫ਼దᦡᎶ Ѭᦡʷ˔ appkeyὊՏಪऄၹᄊᭊර˞ХѬᦡᬷᏆߛϲᠫູǍे ᬷᏆᄊ᫈ిᬍǍेЯᦊऄၹᭊ᜶Ύၹ TFS ὊTFS ͘ፌඈ˔ऄၹ ေᬷᏆὊඈ˔ྭေྭ̏דMySQL ः᧗Ὂඋݠʷ˔ᣤᬷᏆ᧗ద ᬷᏆʽጳႀᤂ፥ኮေ̡րҫ҂ښ ඈ˔ᬷᏆᄊᦊᎸηৌ͘ ٨ʽጳὊᄰଌಪവၷੇᦡᎶ͈ǍښᦡᎶവὊ MySQL ः᧗ᦐదʷݓ ښҞవǍ˞̀ᥘВᦡᎶ͈ᩲឨὊඈ˔ᬷᏆ ߛϲ 2 ˔ҞవὊᏫదᄊᬷᏆѷ᜶රఞᰴᄊԻ᭥ভὊඈ˔ block ߛϲ 3 ˔ ᦡᎶʽᤰ௧ˀՏᄊὊඋݠదᄊᬷᏆ᜶ර blockښ࿘ቡᄊܳ˔ᬷᏆ rcserverᤉᛡፒʷኮေǍ MySQL ः᧗ὊᤰᠫູኮေҬ٨ ښTFS ࠲ਫ਼దᄊᠫູηৌߛϲ ࠃЯᦊᦊᎸదܳ˔ᬷᏆnjʽӢԼҬ٨Ὂదᄈ˔ऄၹ᫈Ὂښ TFS ᤂ፥ኮေ ࠇਗ਼ቫ͘࠽ត̰Х̵ᄊྭေᬷᏆ឴ԩ͈Ǎ 143 ᅲ䏉ټᅱ⍋䞣᭛ӊᄬ⎬ ຺dฉ zyd_cuēఆ୷ϐᅖdדޥ೬ωူӖޏْᄵ ಬߌܫᅟಀԅྡྷ໔ၗົંēڑฉ౨ْᄵิc॓ܚᅟڑܫ ཙࡎēંၽඇͯۢຂහϦϵူ TFS ԅ֟ٝᆴdڑϣӉ҅ອד Ӗ༰dཙࡎಓࠈխᅖྑҶಹޏᄯࢳܟ㗙ҟ㒡˖ ဖՊēΘྜဟ ᄊߛϲੇవᬌͰ 25%~50%Ǎ ኖ႕Ὂសᮊᄬᄊʽጳᮕᝠ͘Ύ TFS͋ܬ҂ጇፒ˗Ὂၹ̆̽ణ͜ፒᄊҞవ ࠲ Erasure code షऄၹܬဋnjᬌͰߛϲ̿ԣᤂ፥ੇవǍᄬҒ TFS ю δԻ᭥ভᄊ۳ᆩʽଢᰴҬښTFS ˟᜶ԧ࡙வՔʷᄰ௧ ళࢺͻ 144 ᭄ᯊҷⱘ IT ᶊᵘ䆒䅵