Designed and optimized ETL pipelines with Spark and AWS for high-performance data workflows
• Implemented data transformations in Python and PySpark to handle multi-gigabyte datasets.
• Managed Hadoop data lakes and S3 storage for scalable business solutions.
• Configured EMR clusters and EC2 instances to process large datasets efficiently.
• Collaborated with data teams to deliver insights and innovative solutions.
• Ensured data quality through profiling, validation, and pipeline reliability checks.
• Resolved performance issues and enhanced AWS pipeline scalability.
• Documented best practices and contributed to team standards for robust data systems.
• Directed planning, scheduling, budgeting, and coordination of construction projects for on-time delivery.
• Conducted land surveys using AutoCAD and ArcGIS to create price maps and design layouts.
• Managed 10 construction projects worth $3.7M, meeting timelines and quality standards.
• Led the Air Force's largest paint striping project, marking 4.6M square feet to enhance safety and efficiency.
Developed a Chrome plugin leveraging SAML to implement a multi-login system, initiating sessions from KidTec’s website and automating logins to learning platforms such as Code.org and Flow Labs.
● An e-commerce store that lets customers purchase goods and services directly from sellers. Sellers seeking business continuity through online platforms can make their products more attractive by providing detailed product information, all in a real-time environment.
● The home page displays all available shopping categories along with the subcategories associated with particular items.
● Admins have permission to add new items, update product descriptions, modify prices, and remove items whenever needed.
Environment: HTML, CSS, JavaScript, Angular, Java, Spring Boot, JPA, Microservices, MySQL, Docker, Maven, Git, and GitHub
Designed and implemented pyTLEX, a Python-based translation of the TimeLine EXtraction (TLEX) algorithm from its Java counterpart (jTLEX).
Developed key features, including a React-based visualization system for exploring temporal data, algorithms for enhancing temporal graph connectivity, and a built-in validation system ensuring compliance with TimeML guidelines.
Validated pyTLEX's accuracy and reliability through extensive testing on diverse datasets, delivering a robust toolkit for researchers and developers.