Data Analyst @Barclays | MS in Data Science and Analytics @Georgetown University | nikitagpardesi@gmail.com
Highly motivated and passionate about Data Science, with extensive experience in data mining, analysis, modeling, and app deployment on the cloud. Always looking for opportunities to explore applications of Data Science and Analytics in different fields. An avid traveler.
Data Analyst
• Developed extensive automated data pipelines in Python with REST API integration through a Flask frontend, employing HTML and JavaScript for authentication, data extraction, ingestion, and cleaning processes. These scripts, accessible to both developers and team members, reduced time and day-to-day effort by around 60%.
• Constructed comprehensive Tableau dashboards to analyze user and stakeholder interaction data, aiding in informed business decision-making involving key open source technologies such as Hadoop.
• Collaborated with various teams and individual business users to comprehend data requirements, subsequently developing customized automation scripts in Python.
• Implemented automation for software and package installation (Linux) on AWS using Chef. Collaborated with seasoned Chef developers to integrate compliance and security policies.
• Worked with SAP BusinessObjects workflows, databases, security measures, universes, and connections. Developed automated Python scripts for data collection, cleaning, and analysis, focusing on user- and report-level data and on monitoring services across a cluster of nodes.
• Actively involved in Machine Learning sessions at the bank.
Data Management Assistant
• Collaborated with the data research team to extract, clean, and store Chicago crime data, utilizing Python and GCP BigQuery SQL.
• Collaborated on the development and evaluation of data dashboards using Looker.
• Took charge of designing comprehensive dashboard mockups and developed a web-based dashboard using Plotly Dash.
Machine Learning Engineer Intern
• Developed ML algorithms (AdaBoost, XGBoost, ElasticNet, Random Forest, Stochastic Gradient Descent, SVM, ARIMA, SARIMA, Exponential Smoothing) in Python and R, and applied hyperparameter tuning.
• Provided users with model interpretability and explainability features using InterpretML in Python.
• Created Docker images for training and testing the models in real time on different platforms and operating systems. Deployed models on AWS using SageMaker, Lambda, and EC2.
Data Science Executive
• Built an R&D dashboard for Reckitt: data collection, creating APIs in Flask and Django, database design and management, and custom BERT models for automated categorization, with a focus on News and Text Analytics (Natural Language Processing).
• Built market sizing forecast models in Python and performed social media sentiment analysis.
• Hands-on experience with Atlassian Jira and Confluence for Agile Project Management.
• FMCG sector analysis (region- and segment-wise): insights through visualizations and financial analysis in Python.
• Automated the collection, indexing, searching, summarization, text analytics, and visualization of company earnings call transcripts. Led a team of 10 data science interns.
MS in Data Science and Analytics
GPA: 3.78/4
Awarded Returning Student Scholarship
Coursework: Introduction to Data Science and Analytics, Statistics and Probability,
Optimization, Big Data and Cloud Computing, Statistical Inference, Advanced Data Visualization,
Natural Language Processing, Neural Networks, Money Banking and Financial Markets
Positions of Responsibility: Teaching Assistant – Machine Learning Application Deployment, Advanced Math and Stat Computing, Blockchain Technologies | Student Ambassador | CSET Data Annotator Research Assistant
B.E. Electronics and Telecommunication
GPA: 8.24/10
Relevant Coursework: Database Management System (DBMS) & Big Data and Cloud Computing
Python: Data Collection- BeautifulSoup, Selenium, Scrapy; Data Cleaning and Pre-Processing- NumPy, Pandas, Pydantic, Cerberus, PySpark; Data Storage- SQL, MySQL, NoSQL, HiveQL, MongoDB, CSV, Parquet, Avro, JSON/YAML; Data Modeling- sklearn, PyTorch, SciPy, AutoML, TensorFlow, InterpretML; REST API/Web Services- Flask, Django; Multiprocessing; Apache Spark; Data Visualization- Plotly, Dash
R, HTML, CSS, HighCharts JS, Natural Language Processing (NLTK, BERT, Hugging Face), Bash, Ruby
Chef Automation, NLTK, SpaCy, Gensim, TensorFlow, Keras, Django, Flask, Apache Spark, Spark SQL, Hadoop MapReduce, HiveQL, AutoML, SciPy, Multiprocessing, InterpretML, LIME, Shapley
AWS: EC2, Lambda, S3, Cloud9, SageMaker, EMR, Hadoop User Experience (HUE), Kinesis, Kafka
GCP: BigQuery
SAP BusinessObjects, Kubernetes, Docker, Git, PostgreSQL, MySQL, SQLite, Tableau, Looker, Power BI, Atlassian- Jira, Confluence, UiPath, Automation Anywhere
A data viz mini project using Python and Highcharts.
Analyzing subreddit with Apache Spark and other Big Data Tools