
INDUSTRIAL INTERNSHIP

WEEKLY PERFORMANCE REPORT (WPR)

Student Name: Vikas Gupta


Supervisor Name: B. P. Mishra/ Shivani Mishra
Coordinator/Team Leader Name: Namira Rangrej
Mentor Name: Pranshu Sharma
Organization: CureYa
Hours Worked: Monday-2 hrs, Tuesday-2 hrs, Wednesday-1.5 hrs, Thursday-1.5 hrs, Friday- 3 hrs

Summarize your thoughts regarding your internship this week. Include duties you have performed,
facts and procedures you have learned, skills you have mastered, and observations you have made.

Week-5 (24 May 2021 to 28 May 2021)


Monday:
Data Visualization: It is the process of converting data into information in the form of charts,
diagrams, pictures, etc., to support timely decision-making.
Data Visualization using QlikSense: QlikSense is one of the biggest players in the Business
Intelligence (BI) market; the company has been operating since 1993 and provides BI services to
around 1700 customers all around the world.
There are 3 steps of visualization (a minimal Python sketch of these steps follows the list):
1. Extraction of data from various data sources,
2. Modeling: the clean-up step, whose outcome is a single table or a set of interlinked tables,
3. Visualization: it includes dimensions (columns) and metrics (operations).
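As a rough illustration of these three steps outside Qlik, here is a minimal Python sketch using pandas and matplotlib. The file name sales.csv and its columns region and amount are hypothetical placeholders, not data from this internship.

import pandas as pd
import matplotlib.pyplot as plt

# 1) Extraction: pull raw data from a source (here, a hypothetical CSV file).
raw = pd.read_csv("sales.csv")

# 2) Modeling: clean up and reshape into a single tidy table.
model = (
    raw.dropna(subset=["region", "amount"])
       .groupby("region", as_index=False)["amount"].sum()
)

# 3) Visualization: a dimension (region) on one axis, a metric (total amount) on the other.
model.plot(kind="bar", x="region", y="amount", legend=False)
plt.ylabel("Total amount")
plt.tight_layout()
plt.show()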
What is the importance of Data Visualization? AND Why BI Visualization tools are required?
Answer:
1) Connect varieties of data sources, bring data into a single platform and perform
transformations.
2) Best Data Compression (e.g., 40 MB of data can be compressed to about 5 MB)
3) In Memory (Improved Performance)
4) Data Associativity feature (selections in one filter interactively update all related data)
5) Machine learning and deep learning integration (Mash-Up in QlikSense)
6) Easy to use
7) Accessibility over internet (host the application on server and share)
8) Embedded Analytics (bring visualization to websites)
9) Geo Analytics (dig data deep down geographically)
10) Integration of open source chart libraries like chart.js, D3.js etc.
QlikSense provides big data source connectors, flat file connectors, SQL database connectors,
and API connectors.
Who are the End Users:
DAR Structure (Dashboard, Analysis, Reports)
Dashboard: - For Higher Level Management only (Mentioning critical points only)
Analysis: - For Middle Level Management (Aggregate view, deep view of data)
Reports: - For Low Level Management (record-by-record transactions, for Team Leaders (TLs), etc.)
Top BI Tools:-
1) QlikView/QlikSense (ETL and customization features), 2) Tableau, 3) Power BI,
4) SiSense, 5) Kibana, 6) MicroStrategy, 7) Birst, 8) Tibco Spotfire, 9) Looker
Qlik Installation: - Go to qlik.com/us→ Register on it→ Go to “Support”→ Go to
“Download”→ download the latest version (don’t forget to check ‘extras’)
QlikSense Hub: - Login→ Go to ‘desktop hub’→ click on “CreateNewApp” (it’ll be in qvf
format)
Now, open the app→ add data from files → load a csv file → add data→ go to “data load
editor” to check.
This is used to generate insights.
Go to ‘script editor’ to load data→ create new section by clicking the ⊕ icon→ Name the section→
create new connection→ select folder→ enter path→ name the connection→ select data
Edit connection→ insert data→ click on load data
App Overview→ Sheets, Bookmarks, Stories
Go to Sheets→ click on CreateSheet→ click on association
Data Model Viewer: interconnection of multiple tables
Creation of variables, bar charts, and add-ons. Change appearance such as title, footnote,
presentation, colors of bars.
Creating multiple charts: go to ‘Tables’→ go to ‘Data’→ click on “Add column”
Data Transformation Basics: -
Give appropriate table names (by default it will be the file name of the csv).
Use // (comment) to ignore a column.
Adding Filters: Make changes→ Save→ Load again→ Go to “Model Viewer”
For functions in scripts and expressions, refer to help.qlik.com for documentation.

Tuesday:
PyTorch Tutorial: -
PyTorch is developed by the Facebook AI Research (FAIR) lab. PyTorch has a C++ interface, which
makes it very fast. A number of pieces of deep learning software are built on top of PyTorch,
including Tesla Autopilot, Uber’s Pyro, Hugging Face’s Transformers, PyTorch Lightning, and
Catalyst.
From Research to Production: - An open source machine learning framework that
accelerates the path from research prototyping to production deployment. It is used for
Computer Vision and Natural Language Processing.
PyTorch provides 2 high-level features:
1) Tensor Computing (like NumPy) with strong acceleration via GPU
2) Deep Neural Networks built on a type-based automatic differentiation system
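A minimal sketch, assuming only the standard torch package (not code from this report), showing both features: NumPy-like tensor computing with optional GPU acceleration, and automatic differentiation.

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Tensor computing, NumPy-like but on the chosen device
a = torch.rand(3, 3, device=device)
b = torch.rand(3, 3, device=device)
c = a @ b + 2 * a          # matrix multiply and element-wise operations

# Automatic differentiation (reverse-mode autograd)
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()         # y = x1^2 + x2^2
y.backward()               # compute dy/dx
print(c.shape, x.grad)     # x.grad == tensor([4., 6.])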
PyTorch Videos on:
• Installation of PyTorch for Deep Learning
• Understanding of Tensors: - A tensor is a generalization of vectors and matrices,
and is easily understood as a multidimensional array. In machine learning, the
training and operation of deep learning models can be described in terms of
tensors. In many cases, tensors are used as a replacement for NumPy arrays in
order to use the power of GPUs.
Tensors are a type of data structure used in linear algebra, and, like vectors and
matrices, you can perform arithmetic operations with them.
• Back Propagation using PyTorch (compute derivatives)
• Creating an ANN (Artificial Neural Network) using PyTorch (a minimal sketch follows this list)
• Kaggle Advanced House Price Prediction using PyTorch (tabular dataset)
• How to use a GPU to run PyTorch code
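Below is a minimal, hedged sketch of the last two topics (building a small ANN and running it on a GPU). The layer sizes and the random training data are placeholders, not taken from any dataset mentioned in this report.

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# A tiny feed-forward ANN: 10 inputs -> 16 hidden units -> 1 output
model = nn.Sequential(
    nn.Linear(10, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
).to(device)

X = torch.randn(64, 10, device=device)   # 64 samples, 10 features (random placeholders)
y = torch.randn(64, 1, device=device)    # regression targets (random placeholders)

loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                      # back propagation
    optimizer.step()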

Wednesday:
• Revision of Python basics, statistics, and machine learning algorithms.
• Installed the Python package “covid” and performed analysis of the data using its
various methods and functions (a short sketch follows this list).
• Hands-on experience with Pandas Profiling.
• Deep Learning study (Neural Networks)
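A short, hedged sketch of that hands-on work. I am assuming the PyPI “covid” package’s Covid class (with get_data() and get_status_by_country_name() methods) and the pandas_profiling ProfileReport API; treat the exact method names as assumptions rather than as the code used during the internship.

import pandas as pd
from covid import Covid                      # pip install covid (assumed API)
from pandas_profiling import ProfileReport   # pip install pandas-profiling

covid = Covid()
print(covid.get_status_by_country_name("india"))   # single-country snapshot

df = pd.DataFrame(covid.get_data())                 # all countries into a DataFrame
profile = ProfileReport(df, title="COVID-19 data profile")
profile.to_file("covid_profile.html")               # one-command EDA report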

Thursday:
One-to-one discussion with Dr. Bajarang Mishra Sir on the research paper: discussed Artificial
Intelligence (AI) techniques: 1) Neural Networks, 2) Fuzzy Logic, 3) Genetic Algorithms (GAs),
and 4) Hybrid methods. I did a comparative study of all these techniques. I also did a
comparative study of ANN (Artificial Neural Networks), ANFIS (Adaptive Neuro-Fuzzy
Inference Systems), CANFIS (Co-Active Neuro-Fuzzy Inference Systems), and hybrid intelligent
systems.
Types of research papers: 1) Patent paper (topmost priority), 2) Transactions paper, 3)
Journal Paper, 4) International Conference paper.
Criteria for selection: 1) New and innovative ideas (priority), 2) Finding the best
(comparative study of technologies or algorithms), 3) Technical Review Paper (Crux of the
outcome of 20-25 research papers).
Top Publishers: - 1) IEEE (US), 2) Elsevier, 3) Taylor & Francis
Most reputed Journal indexing services:
I. WOS (Web Of Science),
II. SCI (Science Citation Index)
a. ESCI (Emerging Source Citation Index)
b. SCIE (Science Citation Index Expanded),
III. SCOPUS
Research Paper Study: searched for publishers, journals, and topics on IEEE Xplore.

Friday:
Comparison of various machine learning algorithms:
I used the Breast Cancer Wisconsin (Diagnostic) Dataset for this task. The objective was to
predict whether a cancer is benign or malignant. I performed exploratory data analysis
(EDA) on this dataset and then compared the accuracy of various machine learning
algorithms. I found that Logistic Regression and KNN provided the highest accuracy, while
Decision Tree and Naïve Bayes gave the lowest accuracy. The project was uploaded to my
GitHub profile and then it was posted on LinkedIn along with a video (screen recording of
the code) and a GitHub link (including tags of CureYa, Cureya Internship, all CureYa
individuals involved in this internship and related hashtags).
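A minimal sketch of the comparison described above, using scikit-learn’s built-in copy of the Breast Cancer Wisconsin (Diagnostic) dataset. The train/test split, scaling, and default hyperparameters shown here are assumptions; the report does not include the original project’s exact code.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)   # 0 = malignant, 1 = benign
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features (assumed preprocessing; helps KNN and Logistic Regression)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))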
Student Signature: Vikas Gupta Date: 28/05/2021

Head Co-ordinator Signature: Date:

Instructions: After the completed report has been signed by both the student and the Head
Coordinator, the head coordinator shall scan the form to PDF format and email it to the Director-
1 ([email protected]) of the company. Specific problems, concerns, or suggestions from
either the student or the head coordinator should be emailed separately to the C.E.O. ([email protected])
of the company.
