Youmayhaveheardthehubbubinrecentmonthsovertheopen A I G p T toourthemthatallowthemtoproducefakenewsandalsoallowsustodothingslikesentimentclassificationaswellassomethingmoremathematical, whichisrepresentingstringsofcharacters, wordsasmathematicalconstructsthatallowustodeterminerelationshipsbetweenthosewords.
Andsowhatthatallowsyoutodoistotakedotproductsbetweentwoarbitrarywordsinyourdictionaryandyougetnonzerocomponents, andsothatwhatthatmeans, inpracticalterms, isthatyouget a sortofsemanticrelationshipbetweenwordsthatemergesis a consequenceoftrainingyourmodel.
Sothewayitworksinpracticeiswe'regonnahave a wholebunchofreviewsfromtheIMDBdataset, andtheywillhavesomeclassificationsas a goodorbadreview.
So, forinstance, youknow, uh, fortheStarWarslastJedimovie.
Soifyoudid a hugenumberofreviewsforthelast 10 eye, youwouldsee a strongcorrelationofwordssuchashorriblebad.
Wouldn't charactersMarysuethingslikethat?
Andsothemodelwouldthen, uh, takethosewordsrunningthroughtheembeddinglayerandtrytocomeupwith a predictionforwhethernotthatis a goodorbadreviewandmatchituptothetraininglabelandthendobackpropagationtovarythoseweightsinthatembeddinglayer.
Youareabletopredictwhetherornot a reviewispositiveornegativeabout a particularmovie.
Butalsoitshowsyoutherelationshipbetweenthewordsbecausethemodellearnsthecorrelationsbetweenwordswithinreviewsthatgiveiteither a positiveornegativecontext.
Sothatisword M beddingsin a nutshell, andwe'regonnagoaheadandgetstartedcodingthat.
Sothefirstthingwe'regonnahaveis a onembeddinglayer, andthisisjustgonnabeforillustrationpurposes.
Sonowlet's gettothebusinessofactuallyloadingourdatasetanddoinginterestingthingswithit, Soah, wewanttousethedatasetloadfunctionsowell, say, traindateoftestdataandsomeinfo T f d.
S thatload.
IMDbreviewsfestslashsubwords.
Eight.
Okay.
Andthenwewilldefine a split, andthatis T f D s dotsplitthattrain t s notsplitthattest, andwewillhave a coupleotherparameterswithinfoequalstrue, thenincorporatesinformationaboutthe, umaboutthedatasetsand, assupervisedequalstruth.
So, assupervisedtellsthedatasetloaderthatwewanttogetbackinformationintheformofdataandlabelas a topull.
Letmemovemyuglymugoversowecouldsee a littlebitmoreandletusrunthesoftwareandseewhatweget.
Okay, soithasstartedtrainingandittakesaround 10 to 11 secondsperepoch.
Sohe's goingtosithereandtwiddlemythumbsfor a minuteandfastforwardthevideowhilewewait.
So, ofcourse, onceitfinishrunning, I realize I have a typo, andthatistypical.
SoinLine 46 itis.
Itis.
I spelledoutplotandsaidappeal T, butthat's allright.
Let's take a lookatthedatawegetintheterminalanyway, soyoucanseethatthevalidationaccuracythroughout 92.5% prettygoodandthetrainingaccuracyisaround 93.82 So a littlebitofovertraining, and I'verunthis a bunchoftimesandyoutendtoget a littlebitmore.
We'retraining.
I'm kindofsurprisedthatthisfinalnowthat I'm runningoverYouTube, itisactually a littlebitlessovertraining.
Okay, sobeforewedothat, youknow I wanttocleanupthecodefirst.
Let's goaheadanddothat, soah, I willleaveinallthatcommentedstuff.
Butlet's define a fewfunctions.
We'llneedfunctiontogetourdata.
We'llneed a functionthio, getourmodelandwe'llneed a functiontoplotdataand I'llneed a functionThioget R M beddingswillsayretrieveandbeddings, and I'llfillintheparametersforthoseaswegoalong.
Thepurposeofthisfunctionistoahtakearencoderanddumpitto a T S V filethatwecanloadinto a visualizeerinthebrowsertovisualizetheprincipalcomponentanalysisoffourwordandcoatings.
Let's take a lookattheplot, andtheplantistotallyconsistentwithwhatwegotthelasttime, youknow, inincreasingtrainingaccuracyand a levelingoffvalidationaccuracy.
Andthisisjustthevisualrepresentationofwhattheword M beddingslooklikein a reduceddimension, representationofitshigherdimensionalspace.
So I hopethathasbeenhelpful.
I thoughtthiswas a reallycoolproject, just a fewdozenlinesofcode, andyougettosomethingthatisactually a reallyneatkindof a neatresultwhereyouhave, um, a higherdimensionalspacethatgivesyoumathematicalrelationshipsbetweenwords.
Anditdoes a prettygoodjoboflearningtherelationshipsbetweenthosewords.
Andwhat's interestingis I wonderhowwellthiscouldbegeneralizedtootherstuff.
Soifyoudon't knowwhatanencoderis, thebasicideaisthatitis a ahsortofreduceddimensionalrepresentationof a setofwords.
Soyoutake a wordanditassociatesatwithan n dimensionalvector, Uh, thathascomponentsthatwillbe, um, nonperpendiculartootherwordsinyourdictionary.
Sowhatthatmeansisthatyoucanexpresswordsintermsofeachother, whereasifyouseteachwordinyourdictionarytobe a basisfactorthereorthogonalandsothere's norelationshipbetweensomethinglikeKingandQueen, forinstance, whereaswiththeautoencoderrepresentation, uh, restwiththesorry, thewordabettingrepresentation.
Itisthe, umithas a nonzerocomponentofonevectoralonganother.
Soyouhavesomerelationshipbetweenwordsthatallowsyoutoparsemeaningofyourstringoftext, and I give a betterexplanationofmypreviousvideos.
Socheckthatout, uh, foryourowneducation.
Sowe'regonnaneed a coupleofglobalvariablesabovehersize 10,000 a batchsizefortrainingandsomepaddedshapes.
Andthisisforpatting.
Sowhenyouhave a stringofwords, thestringofwordscouldbedifferentlengths.
Soyouhavetopadtothelengthofthelongestreview, basically, andthatisbatsizeby M.
D.
Sothenextthingwe'llneedisouractualdataset.
We'regonnashuffleitbecausewe'regooddatascientists, andwe'regonnawanttoget a paddedbatchfromthatintheshapedefinedwiththevariableaboveonthetestdatasetisverysimilar.
AndthenwewillsayTFKeiraStottlayersonbettingandcoderdotVocabsize 64 TFcareUSplayersBidirectional, uh, t f parislayersdot l s t m 64 toparentheses.
Denseon.
Thatis 64 with a rally.
Youactivation.
If I couldeverlearntotypeproperly, thatwouldbeveryhelpful.
Anotherdenselayerwithanoutputandthisoutputisgoingtoget a sigmoidactivation.
Andwhatthisrepresentsistthe e probabilityofthereviewbeingeitherpositiveornegative.
Sothefinallapofthemodel's gonnabe a floatingpointnumberbetweenzeroandone, anditwillbetheprobabilityofitbeing a positivereview, andwe'regonnapassin a coupleofdummyreviews, justsomekindofsoftballkindofstufftoseehowwellitdoes.
Butbeforethat, wehavetocompileourmodel.
Andwith a binarycrossentropyloss, Optimizerequals T F.
Thatprobablymeans I misspelledthenameofthedataset.
Allright, meScrollup a littlebit.
Ah, it's or I amDevireviews.
Okay, I amAllrightthere.
Dataset.
Youcan't.
You'rerightthere.
Okay, so I misspelledthenameofthedataset.
Not a problem, then.
T f sentiment.
Letus, uh, gouptohere.
I am D B, right.
Quitandgiveitanothershot.
I misspelled.
Dense.
Okay.
Canyouseethat?
No, notquite.
Uh, itsayshere, movemyselfover.
Hasnoattributesdense, Solet's fixthat.
That's inline 24 on 24.
Insertand s quitandtryagain.
There.
Nowitistrainingforfiveeatbox.
I amgonnaletthisrideandshowyoutheresultswhenitisdonereallyquick, youcanseethatitgivesthisfunnyerror.
Letmegoaheadandmovemyfaceoutoftheway.
Nowthis I keepseeinginthetensorflowtostop.
So a SZfaras I cantell, thisisrelatedtotheversionoftensorflow.
Thisisn't something I'm doingoryou'redoing.
There's anopenissueongethub.
And, uh, previouslyitwouldrunthaterroreverytime I trainedwitheveryepoch.
However, afterupdating, I thinktensorflow 2.1.
Itonlydoesafterthe 1st 1 so I guessyougain a littlebitthere.
Butitisdefinitely.
Butit's definitelyanissuewithTensorflow, so I'm nottooworriedaboutthat.
Solet's goinonthistrain, allright?
Soithasfinishedrunningand I haveteleportedtothetoprightsoyoucanseetheaccuracyandyoucanseeaccuracystartsoutlowandendsuparound 93.9% nottooshabbyforjustfive e poxon a verysimplemodel.
Ofcourse, youneedanembeddinglayertostartandcoderdayvocabsize 64 Let's movemymugLikesoAndatournextlayer, whichiscaresslayersbidirectional l s t m 64 returntrueAnd I amwaytoofarover E thatisstillwe'rejustgonnahavetolivewithit.
It's justgonnabebadcode.
Notuptothepepeightstandards, butwhatever.
Assumingbidirectionalhello s t m 32 caresslayersthoughtthatdensein 64 with a valueactivationandtopreventoverfitting, wearegonnaaddin a littlebitofdropoutjust 0.5 50% Andatourfinalclassificationslayerwith a sigmoidactivationmodel.
Soprettygoodimprovementwith a they, youknow, somewhatmorecomplicatedmodelandattheexpenseofslightlylongertraining.
So, youknow, 87 secondsissupposedto 47 seconds.
Sohonorsmysixminutesissupposedtothree.
Nottoobad.
Soanyway, sowhatwehavedonehereisloaded.
A seriesof I'm tobereviewsusedtotrain a modeltodosentimentpredictionbylookingatcorrelationsbetweenthewordsandthelabelsforeitherpositiveornegativesentimentandthenaskingthemodeltopredictwhatthesentimentof a obviouslypositiveandsomewhatlukewarmreviewwas.
Andwegetprettygoodresultsin a veryshortamountoftime.
Thatisthepoweroftensoflowtwopoint.
Oh, so I thankyouforwatchinganyquestionscommon. 00:59:12.770