The Data Scientist, part of the core Data Science team, will work with internal teams and clients to identify business needs for advanced analytics and to develop predictive and analytical models, solutions, and tools using applied mathematics, statistics, and machine learning, in order to help clients improve different areas of their businesses and domains. This profile requires a strong aptitude for research, analytical thinking, and an understanding of different metrics for a data-driven approach to problem solving.
A complete working understanding of the data science life cycle is mandatory. This includes, but is not limited to, data sourcing, data cleaning, data mining, exploratory data analysis, and model building with descriptive, predictive, and prescriptive analytics as per domain and business requirements. The role requires a keen interest in analyzing large amounts of data and investigating it thoroughly to draw meaning from it with proper diagnosis.
The candidate must be an expert in model-building techniques such as time series, multivariate linear regression, and classification methods (logistic regression, decision tree, KNN, random forest, neural network, SVM, etc.), with a proper understanding of ensemble methods such as boosting and bagging. The candidate should also have a complete understanding of clustering techniques and of dimension reduction techniques such as principal component analysis.
The candidate should be proficient in one or more programming languages such as R, Python, or Scala (with Spark).
Experience and key skills:
Knowledge of basic statistics and probability theory (including PDFs, CDFs, PMFs), p-values, confidence intervals, measures of central tendency, measures of dispersion, skewness and kurtosis, z-test, t-test (Student's t-test, paired and unpaired), chi-square test, proportion test, F-test, ANOVA, different distributions (normal, Bernoulli, binomial, Poisson, etc.), null hypothesis (types of errors), types of variables, etc. (Mandatory)
R/Python/Scala with advanced packages/modules (Mandatory)
Machine learning algorithms such as multivariate regression, classification (logistic regression, decision tree, random forest, k-nearest neighbor, neural network, SVM, Naive Bayes, etc.), clustering (K-Means, hierarchical clustering, Partitioning Around Medoids (PAM), etc.), time series (ARMA, ARIMA, exponential smoothing, etc.), AdaBoost, and ensembles (bagging/boosting techniques) (Mandatory)
Knowledge of different graphs: box plots, scatter plots, histograms, pie charts, bar graphs, etc. (Mandatory)
Data Mining and Text Mining (Mandatory)
Knowledge of or experience with big data platforms such as Hadoop, Hive, Pig, Sqoop, Flume, HBase, Scala (added advantage but not mandatory)
Knowledge of any domain such as banking and finance, healthcare, supply chain management, telecom/mobile, IT/ITeS (added advantage but not mandatory)
- Experience Required: 1-3 years
- Company Name: INFOFACES TECHNOLOGY SOLUTIONS PRIVATE LIMITED
- Contact Person: Ganesh Gopalakrishnan
- Email ID: email@example.com