End-to-end development lifecycle for data engineers to build a data integration pipeline using AWS Glue

AWS Big Data

Configure your Git repository with CodeCommit: In an earlier step, you cloned the Git repository from GitHub. Although it’s possible to configure the AWS CDK template to work with GitHub, GitHub Enterprise, or Bitbucket, for this post, we use CodeCommit. aws:/home/glue_user/.aws

How to tackle a real-world problem with GuidedLDA

Insight

Part of Speech (POS) Tagging: After LDA, I decided to tag the part of speech (POS) of each word in every reflection and extract all of the verbs. Langdetect is pretty accurate when the input is a sentence, but less so when the input is a single word.
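
The verb-extraction step described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the article's code: it assumes the reflections have already been tagged with Penn Treebank tags (as produced by a tagger such as NLTK's `pos_tag`), and the function name `extract_verbs` and the sample sentence are hypothetical.

```python
from typing import Iterable, List, Tuple


def extract_verbs(tagged_tokens: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the lowercased tokens whose tag marks a verb.

    Assumes Penn Treebank tags, where every verb tag starts with "VB"
    (VB, VBD, VBG, VBN, VBP, VBZ).
    """
    return [token.lower() for token, tag in tagged_tokens if tag.startswith("VB")]


# Hand-tagged sample reflection (hypothetical data, not from the article).
tagged_reflection = [
    ("I", "PRP"), ("learned", "VBD"), ("to", "TO"), ("debug", "VB"),
    ("and", "CC"), ("enjoyed", "VBD"), ("pairing", "VBG"), (".", "."),
]

verbs = extract_verbs(tagged_reflection)
print(verbs)  # ['learned', 'debug', 'enjoyed', 'pairing']
```

Filtering on the tag prefix keeps all verb forms together; a real pipeline would likely lemmatize the results before counting them.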

Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg

AWS Big Data

Major market indexes, such as the S&P 500, are subject to periodic inclusions and exclusions for reasons beyond the scope of this post (for an example, refer to CoStar Group, Invitation Homes Set to Join S&P 500; Others to Join S&P 100, S&P MidCap 400, and S&P SmallCap 600).