Introduction to AI & DS Lab
B. Tech Minor Course, Class: ECE, Semester: IV, Regulation: R23
B.Tech-AI&DS 23ADM2- Introduction To Artificial Intelligence and Data Science Lab L T P Cr. 0 0 3 3
Pre-requisite: Knowledge of Computer Fundamentals and Data Structures & Algorithms
Course Educational Objective: The objective of the course is to provide a strong foundation in the fundamental concepts of Artificial Intelligence, a basic exposition of its goals and methods, and the fundamentals of Data Science.
Course Outcomes: At the end of this course, the student will be able to:
CO1 : Apply the basic principles of AI in problem solving using Python (Apply – L3)
CO2 : Implement different algorithms using Python (Apply – L3)
CO3 : Perform various operations using NumPy and pandas (Understand – L2)
CO4 : Improve individual / teamwork skills, communication & report writing skills with ethical values.
List of Experiments (Artificial Intelligence)
1. Implementation of DFS for water jug problem using Python
2. Implementation of BFS for tic-tac-toe problem using Python
3. Implementation of Hill-climbing to solve 8-Puzzle Problem using Python
4. Implementation of Monkey Banana Problem using PROLOG
List of Experiments (Data Science)
1. Creating a NumPy Array
a. Basic ndarray
b. Array of zeros
c. Array of ones
d. Random numbers in ndarray
2. The Shape and Reshaping of NumPy Array
a. Dimensions of NumPy array
b. Shape of NumPy array
c. Size of NumPy array
d. Reshaping a NumPy array
3. Indexing and Slicing of NumPy Array
a. Slicing 1-D NumPy arrays
b. Slicing 2-D NumPy arrays
c. Slicing 3-D NumPy arrays
d. Negative slicing of NumPy arrays
4. Perform following operations using pandas
a. Creating data frame
b. concat()
c. Adding a new column
5. Read the following file formats using pandas
a. Text files
b. CSV files
c. Excel files
d. JSON files
6. Perform following visualizations using matplotlib
a. Bar Graph
b. Pie Chart
c. Box Plot
d. Histogram
e. Line Chart and Subplots
What is the Initial State? Usually (0, 0), representing both jugs are empty.
What is the Goal State? A state where Jug 1 or Jug 2 contains a specific amount $G$, e.g., (2, y) or (x, 2).
Why use a Stack? DFS follows a Last-In-First-Out (LIFO) structure. A stack allows the algorithm to dive deep into one branch before backtracking.
Is DFS optimal for this? No. DFS finds a solution, but it might be much longer than the shortest possible sequence of pours.
How do you represent "Pouring"? If pouring from Jug $A$ to Jug $B$: $Amount = \min(\text{Jug } A, \text{Capacity } B - \text{Jug } B)$.
What is the "Visited" set? A Python set() that stores previously explored tuples to prevent infinite recursion/loops.
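How do you handle the 8-gallon/5-gallon math? A solution exists only if the goal amount is a multiple of the Greatest Common Divisor (GCD) of the two jug capacities and does not exceed the capacity of the larger jug. For 8 and 5 the GCD is 1, so every whole-number goal up to 8 is reachable.
What is the Time Complexity? $O(V + E)$, where $V$ is the number of possible states, at most $(Capacity_1 + 1) \times (Capacity_2 + 1)$, and $E$ is the number of transitions between them.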
What is the Space Complexity? $O(V)$ to store the visited states and the recursion stack.
Can you solve this with BFS? Yes, and BFS would guarantee the minimum number of steps.
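The answers above (stack, visited set, pouring rule, GCD check) can be put together in a minimal sketch. The function name water_jug_dfs and the 4/3/2 example are illustrative assumptions, not part of the lab sheet; the goal test accepts the target amount in either jug.

```python
from math import gcd

def water_jug_dfs(cap1, cap2, goal):
    """Iterative DFS over (jug1, jug2) states; returns a list of states or None."""
    # Solvability check: the goal must fit in a jug and be a multiple of the GCD.
    if goal > max(cap1, cap2) or goal % gcd(cap1, cap2) != 0:
        return None

    start = (0, 0)
    stack = [(start, [start])]   # LIFO stack of (state, path so far)
    visited = {start}            # set of tuples already generated

    while stack:
        (a, b), path = stack.pop()
        if a == goal or b == goal:
            return path

        pour_ab = min(a, cap2 - b)   # amount that can move from jug 1 to jug 2
        pour_ba = min(b, cap1 - a)   # amount that can move from jug 2 to jug 1
        successors = [
            (cap1, b), (a, cap2),            # fill either jug
            (0, b), (a, 0),                  # empty either jug
            (a - pour_ab, b + pour_ab),      # pour jug 1 into jug 2
            (a + pour_ba, b - pour_ba),      # pour jug 2 into jug 1
        ]
        for state in successors:
            if state not in visited:
                visited.add(state)
                stack.append((state, path + [state]))
    return None

path = water_jug_dfs(4, 3, 2)
print(path)
```

The path printed is a valid sequence of states but not necessarily the shortest one; replacing the stack with a queue would turn this into BFS.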
Why use BFS? To explore all possible moves at the current level before moving deeper, ensuring we find the shortest path to a win.
What is the Board Representation? A 1D list of 9 strings [' ', 'X', 'O'...] or a 2D $3 \times 3$ matrix.
How does the Queue work? You use collections.deque. You popleft() the current board, generate all 1-move variations, and append() them.
What is the Branching Factor? It starts at 9 (first move), then 8, then 7, etc.
How many total states exist? There are at most $3^9 = 19,683$ ways to fill the 9 cells, but many of these are unreachable in legal play.
How do you check for a win? Check 8 lines: 3 horizontal, 3 vertical, and 2 diagonal.
What is a "Terminal State"? A state where a player has won or the board is full (Draw).
Difference between BFS and Minimax? BFS explores the state space blindly; Minimax uses a scoring system to choose the best move for a player.
What is a "Successor Function" here? A function that finds all empty indices and returns new board objects with the current player's mark in those spots.
Is BFS memory-intensive? Yes, because it stores all nodes at the current depth in the queue.
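A minimal sketch of the queue, successor function, and win check described above. The board is stored as a tuple rather than a list so it can be placed in a set; the function names and the choice to search for the shortest X win from an empty board are illustrative assumptions.

```python
from collections import deque

# The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for i, j, k in WIN_LINES:
        if board[i] != ' ' and board[i] == board[j] == board[k]:
            return board[i]
    return None

def successors(board, player):
    """All boards reachable by placing player's mark in one empty cell."""
    return [board[:i] + (player,) + board[i + 1:]
            for i, cell in enumerate(board) if cell == ' ']

def bfs_shortest_win(start, target='X'):
    """Breadth-first search for the nearest board where target has won."""
    queue = deque([(start, 'X', 0)])   # (board, player to move, depth)
    seen = {start}
    while queue:
        board, player, depth = queue.popleft()
        if winner(board) == target:
            return board, depth
        if winner(board) is not None or ' ' not in board:
            continue                   # terminal state: do not expand
        next_player = 'O' if player == 'X' else 'X'
        for child in successors(board, player):
            if child not in seen:
                seen.add(child)
                queue.append((child, next_player, depth + 1))
    return None

empty = (' ',) * 9
board, depth = bfs_shortest_win(empty)
print(depth, board)
```

Because BFS expands level by level, the first winning board it dequeues is at the minimum possible depth.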
What is Hill Climbing? A mathematical optimization technique that moves "uphill" (towards a better heuristic value) until it reaches a peak.
What is a Heuristic? An estimate of the cost to reach the goal from the current state.
What is Manhattan Distance? The sum of the vertical and horizontal distances of tiles from their goal positions.
What is Misplaced Tiles? A count of how many tiles are currently in the wrong spot compared to the goal.
What is a "Local Maximum"? A state better than all its neighbors but not the actual goal.
What is a "Plateau"? A flat area of the search space where all neighboring states have the same heuristic value.
How do you move the "Blank" tile? By swapping the 0 (blank) with its neighbors (Up, Down, Left, Right).
Is Hill Climbing "Informed"? Yes, because it uses a heuristic to guide the search.
What is the "Greedy" property? It only cares about the immediate best neighbor, not the long-term path.
How do you fix getting stuck? Use "Random Restarts" or "Stochastic Hill Climbing" (choosing moves at random proportional to their quality).
What is the State Space size? $9! / 2 = 181,440$ reachable states.
Why $9! / 2$? Only half of the 8-puzzle configurations are solvable due to parity constraints.
What is the goal state? Usually [1, 2, 3, 4, 5, 6, 7, 8, 0].
Does Hill Climbing use a Queue? No, it only keeps track of the "Current" and "Next" state.
Space Complexity of Hill Climbing? $O(1)$ (not counting the board itself), making it very memory efficient.
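A minimal sketch of steepest-ascent hill climbing on the 8-puzzle, using Manhattan distance as the heuristic. Here "uphill" means a lower distance to the goal, so the code picks the neighbor with the smallest value. The function names and the start board are illustrative assumptions.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def manhattan(state):
    """Sum of row and column distances of each tile from its goal cell."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue                      # the blank is not counted
        goal_idx = tile - 1               # with this GOAL, tile t sits at index t-1
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

def neighbors(state):
    """Boards reachable by sliding the blank Up, Down, Left, or Right."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    result = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            board = list(state)
            board[blank], board[swap] = board[swap], board[blank]
            result.append(tuple(board))
    return result

def hill_climb(start):
    """Keep only the current state; stop at the goal, a local optimum, or a plateau."""
    current = start
    path = [current]
    while current != GOAL:
        best = min(neighbors(current), key=manhattan)
        if manhattan(best) >= manhattan(current):
            break                         # no strictly better neighbor: stuck
        current = best
        path.append(current)
    return current, path

start = (1, 2, 3, 4, 5, 6, 0, 7, 8)
final, path = hill_climb(start)
print(final == GOAL, len(path) - 1)
```

This easy start board is solved in two moves; harder boards can stop early at a local optimum, which is where random restarts come in.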
What is the "Task"? A monkey in a room needs to reach a banana hanging from the ceiling by moving and climbing a box.
Define the State? A 4-tuple: (Monkey_Pos, Box_Pos, On_Box, Has_Banana).
What is an "Action"? A function that changes the state (e.g., PushBox, Climb, Grasp).
Action: "Push"? Precondition: Monkey and Box are at the same position. Effect: Both move to a new position.
Action: "Climb"? Precondition: Monkey and Box are at the same position. Effect: On_Box becomes True.
Action: "Grasp"? Precondition: Monkey is on the box and the box is under the banana. Effect: Has_Banana becomes True.
What is a "Precondition"? Conditions that must be true for an action to be performed.
Is this a "Planning" problem? Yes, it requires a sequence of actions to reach a goal.
What is Means-Ends Analysis? Looking at the difference between current and goal states and picking an action to reduce it.
How do you implement this in Python? Usually via a recursive search or a simple state-machine.
What if the box is already under the banana? The "Push" action is skipped, and the monkey proceeds to "Climb".
What are "post-conditions"? The changes that occur in the environment after an action.
Is the environment "Static" or "Dynamic"? Static, because nothing changes unless the monkey acts.
What is a State Transition? The process of moving from State A to State B via an Action.
Why is this a classic AI problem? It demonstrates how machines can reason about physical objects and causality.
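The state tuple, preconditions, and effects above can be sketched as a small search-based planner in Python (the lab sheet itself lists PROLOG for this experiment). The room positions, the Walk action, and all names below are illustrative assumptions.

```python
from collections import deque

POSITIONS = ("door", "middle", "window")   # hypothetical room layout
BANANA_POS = "middle"                      # assumed location of the banana

def actions(state):
    """Yield (action name, next state) pairs whose preconditions hold."""
    monkey, box, on_box, has_banana = state
    if on_box:
        # Grasp: monkey is on the box and the box is under the banana.
        if box == BANANA_POS and not has_banana:
            yield "Grasp", (monkey, box, True, True)
        return
    for pos in POSITIONS:
        if pos != monkey:
            yield f"Walk({monkey}->{pos})", (pos, box, False, has_banana)
            if monkey == box:
                # Push: monkey and box are together; both move.
                yield f"Push({monkey}->{pos})", (pos, pos, False, has_banana)
    if monkey == box:
        # Climb: monkey and box are at the same position.
        yield "Climb", (monkey, box, True, has_banana)

def plan(start):
    """Breadth-first search for the shortest action sequence that gets the banana."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if state[3]:                       # Has_Banana is True
            return steps
        for name, nxt in actions(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [name]))
    return None

steps = plan(("door", "window", False, False))
print(steps)
# ['Walk(door->window)', 'Push(window->middle)', 'Climb', 'Grasp']
```

When the box already starts under the banana and the monkey is next to it, the same planner skips the push and returns only Climb and Grasp.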
What is NumPy?
Answer: NumPy (Numerical Python) is a library for scientific computing that provides support for large, multi-dimensional arrays and matrices, along with high-level mathematical functions.
How do you install and import NumPy?
Answer: Install via pip install numpy. Import using import numpy as np.
What is an ndarray?
Answer: It stands for "N-dimensional array," the core object of NumPy. It is a table of elements (usually numbers), all of the same type.
Why is NumPy faster than Python Lists?
Answer: NumPy arrays are stored in contiguous memory locations, allow vectorization (no explicit loops), and are implemented in C.
What does "Homogeneous" mean in NumPy?
Answer: Every element in the array must be of the same data type (e.g., all integers or all floats).
How do you create a 1D array from a list?
Answer: arr = np.array([1, 2, 3]).
How do you create a 2D array?
Answer: arr = np.array([[1, 2], [3, 4]]).
What is the default data type of a NumPy array?
Answer: It depends on the input, but usually int64 or float64 on 64-bit systems.
What is the difference between np.array() and np.asarray()?
Answer: np.array() creates a copy of the object by default, while np.asarray() does not create a copy if the input is already an ndarray.
What is a scalar in NumPy?
Answer: A 0-dimensional array containing a single value.
What does the .ndim attribute tell you?
Answer: It returns the number of axes (dimensions) of the array.
What is ndim for a scalar?
Answer: 0.
What is ndim for a vector?
Answer: 1.
What is ndim for a matrix?
Answer: 2.
How can you create an array with a specific number of dimensions immediately?
Answer: Using the ndmin argument: np.array([1, 2], ndmin=5).
What is an "Axis" in NumPy?
Answer: Axes are the dimensions. Axis 0 is usually rows (downward), and Axis 1 is columns (across).
How many axes does a 3D array have?
Answer: 3 (Axis 0, 1, and 2).
In a 2D array, what does axis=0 represent?
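Answer: The vertical axis, running down the rows. Reducing along axis=0 (e.g., arr.sum(axis=0)) collapses the rows and gives one result per column.
In a 2D array, what does axis=1 represent?
Answer: The horizontal axis, running across the columns. Reducing along axis=1 gives one result per row.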
Can an array have 0 dimensions?
Answer: Yes, it is a single point (scalar).
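A short sketch covering array creation (basic, zeros, ones, random) together with ndim and axis. The variable names and the seed value are illustrative.

```python
import numpy as np

basic = np.array([1, 2, 3])              # 1-D ndarray from a list
zeros = np.zeros((2, 3))                 # 2x3 array filled with 0.0
ones = np.ones(3, dtype=int)             # three integer ones
rng = np.random.default_rng(seed=0)      # seeded generator for repeatable output
rand = rng.random((2, 2))                # uniform floats in [0, 1)
scalar = np.array(42)                    # 0-dimensional array
padded = np.array([1, 2], ndmin=5)       # force at least 5 dimensions

for name, arr in [("basic", basic), ("zeros", zeros), ("scalar", scalar), ("padded", padded)]:
    print(name, arr.ndim, arr.shape)

matrix = np.array([[1, 2], [3, 4]])
col_sums = matrix.sum(axis=0)            # collapse rows    -> [4 6]
row_sums = matrix.sum(axis=1)            # collapse columns -> [3 7]
print(col_sums, row_sums)
```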
What does the .shape attribute return?
Answer: A tuple of integers representing the size of the array in each dimension (e.g., (rows, cols)).
What is the shape of a 1D array with 5 elements?
Answer: (5,).
What is the shape of a 2x3 matrix?
Answer: (2, 3).
How does .shape differ from .ndim?
Answer: ndim is a single integer (number of axes); shape is a tuple (length of each axis).
What does the .size attribute return?
Answer: The total number of elements in the array.
How is .size calculated mathematically?
Answer: It is the product of the elements in the .shape tuple.
If an array has shape (3, 4, 2), what is its size?
Answer: $3 \times 4 \times 2 = 24$.
Does changing the shape change the size?
Answer: No, the total number of elements must remain the same when reshaping.
What is .itemsize?
Answer: It returns the length of one array element in bytes.
How do you find the total memory consumed by an array?
Answer: arr.size * arr.itemsize or arr.nbytes.
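The relationship between shape, size, itemsize, and nbytes can be checked on the (3, 4, 2) example from above. The float64 dtype is chosen here only to make the byte counts predictable.

```python
import numpy as np

arr = np.zeros((3, 4, 2), dtype=np.float64)

print(arr.ndim)       # 3 axes
print(arr.shape)      # (3, 4, 2)
print(arr.size)       # 3 * 4 * 2 = 24 elements
print(arr.itemsize)   # 8 bytes per float64 element
print(arr.nbytes)     # 24 * 8 = 192 bytes
```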
What is the .reshape() method?
Answer: It changes the shape of an array without changing its data.
Can you reshape a 1D array of 6 elements into a 2x2 matrix?
Answer: No, because $2 \times 2 = 4$, and we have 6 elements. The sizes must match.
What is the result of arr.reshape(1, -1)?
Answer: It converts a 1D array into a 2D row vector.
What does the parameter -1 do in reshape()?
Answer: It is a placeholder that tells NumPy to automatically calculate the correct dimension for that axis.
Can you use multiple -1 in a single reshape() call?
Answer: No, you can only use one "unknown" dimension.
What is "Flattening" an array?
Answer: Converting a multi-dimensional array into a 1D array.
Name two ways to flatten an array.
Answer: arr.flatten() and arr.ravel().
What is the difference between flatten() and ravel()?
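Answer: flatten() always returns a copy (slower, uses more memory); ravel() returns a view whenever possible (faster, and changes made through the view affect the original), falling back to a copy only when the memory layout requires it.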
What is a "View" in NumPy?
Answer: A view is just a different way of looking at the same memory buffer. Changing a view changes the original array.
Does reshape() return a copy or a view?
Answer: It usually returns a view, provided the memory is contiguous.
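A small sketch of reshape, the -1 placeholder, and the copy-versus-view behaviour of flatten() and ravel(). The array values are illustrative.

```python
import numpy as np

arr = np.arange(6)             # [0 1 2 3 4 5]
grid = arr.reshape(2, 3)       # view with shape (2, 3)
row = arr.reshape(1, -1)       # -1 is inferred as 6 -> shape (1, 6)
col = arr.reshape(-1, 1)       # -1 is inferred as 6 -> shape (6, 1)

flat_copy = grid.flatten()     # independent copy
flat_view = grid.ravel()       # view here, because grid is contiguous

flat_copy[0] = 100             # does not touch arr
flat_view[1] = 200             # writes through to arr
print(arr)                     # [  0 200   2   3   4   5]

try:
    arr.reshape(2, 2)          # 4 != 6, so the sizes do not match
except ValueError as exc:
    error_message = str(exc)
    print("reshape failed:", error_message)
```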
What is "Broadcasting"?
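Answer: A set of rules that lets NumPy perform element-wise arithmetic on arrays of different but compatible shapes by virtually stretching dimensions of length 1 (e.g., adding a scalar or a single row to every row of a matrix).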
What does np.expand_dims() do?
Answer: It inserts a new axis at a specified position to increase the dimension of the array.
What does np.squeeze() do?
Answer: It removes axes of length 1 (e.g., changes shape (1, 5) to (5,)).
What is the Transpose of an array?
Answer: Flipping the array over its diagonal (switching rows and columns) using arr.T.
What is the difference between resize() and reshape()?
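Answer: reshape() returns a new array object (a view where possible) and requires the total size to stay the same. arr.resize() modifies the original array in place and can pad with zeros or truncate data if the size changes; the function np.resize() instead returns a new array and fills extra space with repeated copies of the data.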
How do you check the data type of an array?
Answer: Using arr.dtype.
What happens if you try to reshape a 1D array of 10 elements into (3, 3)?
Answer: It raises a ValueError because $3 \times 3 \neq 10$.
What is the order parameter ('C' or 'F') in reshape?
Answer: 'C' means C-style (row-major memory), and 'F' means Fortran-style (column-major memory).
How do you swap two axes?
Answer: Using np.swapaxes(arr, axis1, axis2).
How do you join two arrays?
Answer: Using np.concatenate(), np.hstack(), or np.vstack().
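The shape-changing helpers above can be tried on small arrays. All values and variable names below are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])

col = np.expand_dims(a, axis=1)        # shape (3,) -> (3, 1)
back = np.squeeze(col)                 # shape (3, 1) -> (3,)
table = col + a                        # broadcasting: (3, 1) + (3,) -> (3, 3)

m = np.arange(6).reshape(2, 3)
transposed = m.T                       # shape (3, 2)
cube = np.zeros((2, 3, 4))
swapped = np.swapaxes(cube, 0, 2)      # shape (4, 3, 2)

stacked_rows = np.concatenate([m, m], axis=0)   # shape (4, 3)
side_by_side = np.hstack([m, m])                # shape (2, 6)
on_top = np.vstack([m, m])                      # shape (4, 3)

print(table)
print(transposed.shape, swapped.shape, side_by_side.shape)
```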
What is indexing in NumPy?
Answer: Indexing is the process of accessing a specific element in an array using its position (index) starting from 0.
What is the difference between Python list indexing and NumPy indexing?
Answer: While similar, NumPy supports multi-dimensional indexing (e.g., arr[1, 2]) which is more efficient than the list-of-lists style (list[1][2]).
What does arr[0] return for a 2D array?
Answer: It returns the first entire row of the array.
What is "Integer Array Indexing"?
Answer: Using a list or another array of integers as indices to pick specific elements (e.g., arr[[0, 2, 4]]).
What is "Boolean Indexing"?
Answer: Using a condition to filter elements (e.g., arr[arr > 5]).
How do you access the last element of a 1D array?
Answer: Using arr[-1].
What happens if you use an index that is out of bounds?
Answer: It raises an IndexError.
Can you change a value using indexing?
Answer: Yes, by assigning a value: arr[2] = 10.
What is the syntax for accessing an element in a 2D array?
Answer: arr[row_index, column_index].
How do you find the index of the maximum element?
Answer: Using np.argmax(arr).
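A quick sketch of the indexing styles above on illustrative data.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
grid = np.array([[1, 2, 3], [4, 5, 6]])

first, last = arr[0], arr[-1]          # 10 and 50
picked = arr[[0, 2, 4]]                # integer array indexing -> [10 30 50]
large = arr[arr > 25]                  # boolean indexing       -> [30 40 50]
element = grid[1, 2]                   # row 1, column 2        -> 6
first_row = grid[0]                    # entire first row       -> [1 2 3]
max_before = int(np.argmax(arr))       # index of the maximum   -> 4

arr[2] = 99                            # assignment through an index
max_after = int(np.argmax(arr))        # now index 2

try:
    arr[10]                            # out of bounds
except IndexError:
    out_of_bounds = True

print(picked, large, element, max_before, max_after)
```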
What is the general syntax for slicing?
Answer: arr[start:end:step].
Does the end index include the element at that position?
Answer: No, the slice is "up to but not including" the end index.
What is the default value of start if omitted?
Answer: 0.
What is the default value of end if omitted?
Answer: The length of the array (includes everything to the end).
What does arr[1:5] do?
Answer: Returns elements from index 1 to index 4.
How do you get every second element of a 1D array?
Answer: arr[::2].
How do you reverse a 1D array using slicing?
Answer: arr[::-1].
What does arr[:] return?
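Answer: A view of the entire array. Unlike a Python list, where list[:] makes a shallow copy, the NumPy slice shares memory with the original.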
What is the result of arr[:3]?
Answer: Elements from the beginning up to index 2.
What is the result of arr[4:]?
Answer: Elements from index 4 to the very end.
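The 1-D slicing answers above, run on an illustrative array of the integers 0 to 9.

```python
import numpy as np

arr = np.arange(10)        # [0 1 2 3 4 5 6 7 8 9]

middle = arr[1:5]          # indices 1..4         -> [1 2 3 4]
every_second = arr[::2]    # step of 2            -> [0 2 4 6 8]
reversed_arr = arr[::-1]   # negative step        -> [9 8 ... 0]
head = arr[:3]             # start defaults to 0  -> [0 1 2]
tail = arr[4:]             # end defaults to len  -> [4 5 6 7 8 9]
whole = arr[:]             # view of everything

print(middle, every_second, head, tail)
```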
How do you slice rows and columns simultaneously in 2D?
Answer: arr[row_slice, col_slice].
How do you extract the first two rows and all columns?
Answer: arr[0:2, :].
How do you extract the third column of every row?
Answer: arr[:, 2].
How do you get the bottom-right $2 \times 2$ sub-matrix?
Answer: arr[-2:, -2:].
What does arr[1, 1:3] return?
Answer: Elements at index 1 and 2 of the second row (index 1).
How is a 3D array indexed?
Answer: arr[depth_index, row_index, col_index].
What does arr[0, :, :] return in a 3D array?
Answer: The first entire 2D "page" or "slice" of the 3D block.
How do you slice across all depths to get the same row from every matrix?
Answer: arr[:, row_index, :].
What is the shape of the result of arr[0:1, 0:1] vs arr[0, 0]?
Answer: arr[0:1, 0:1] returns a 2D array of shape (1, 1); arr[0, 0] returns a single scalar value, with shape ().
How do you select the middle element of a $3 \times 3 \times 3$ array?
Answer: arr[1, 1, 1].
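A sketch of the 2-D and 3-D slicing answers, using small illustrative arrays so the results are easy to check by hand.

```python
import numpy as np

grid = np.arange(1, 10).reshape(3, 3)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]

top_two = grid[0:2, :]            # first two rows, all columns
third_col = grid[:, 2]            # [3 6 9]
bottom_right = grid[-2:, -2:]     # [[5 6] [8 9]]
part_of_row = grid[1, 1:3]        # [5 6]
tiny = grid[0:1, 0:1]             # still 2-D, shape (1, 1)
value = grid[0, 0]                # scalar 1

cube = np.arange(27).reshape(3, 3, 3)
first_page = cube[0, :, :]        # first 2-D page, shape (3, 3)
same_row = cube[:, 1, :]          # row 1 from every page, shape (3, 3)
centre = cube[1, 1, 1]            # 1*9 + 1*3 + 1 = 13

print(third_col, part_of_row, tiny.shape, centre)
```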
What does a negative index represent?
Answer: Counting from the end of the array (e.g., -1 is the last, -2 is second last).
What does arr[-3:-1] return?
Answer: Elements from the 3rd last up to (but not including) the last.
How do you get the last 3 elements of a 1D array?
Answer: arr[-3:].
In a 2D array, what does arr[-1, :] represent?
Answer: The very last row.
How do you use negative steps in slicing?
Answer: arr[5:2:-1] selects elements from index 5 down to 3.
What happens if start is less than end with a negative step?
Answer: It returns an empty array (e.g., arr[2:5:-1]).
How do you slice the last column of a 2D array?
Answer: arr[:, -1].
What does arr[:-1] do?
Answer: Returns all elements except the last one.
How do you get the last two elements of the last two rows in 2D?
Answer: arr[-2:, -2:].
How do you skip the last element while reversing?
Answer: arr[-2::-1].
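The negative-index answers above can be verified on illustrative arrays.

```python
import numpy as np

arr = np.arange(10)                      # [0 1 ... 9]
grid = np.arange(1, 10).reshape(3, 3)

third_last_to_last = arr[-3:-1]          # [7 8]
last_three = arr[-3:]                    # [7 8 9]
downwards = arr[5:2:-1]                  # [5 4 3]
empty = arr[2:5:-1]                      # start < end with negative step -> []
all_but_last = arr[:-1]                  # [0 ... 8]
reversed_skip_last = arr[-2::-1]         # [8 7 ... 0]

last_row = grid[-1, :]                   # [7 8 9]
last_col = grid[:, -1]                   # [3 6 9]

print(third_last_to_last, downwards, empty.size, last_row, last_col)
```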
Is a slice a "copy" or a "view" in NumPy?
Answer: It is a view. Modifying the slice will modify the original array.
How can you create a slice that is a "copy" instead of a view?
Answer: Use the .copy() method: sub_arr = arr[0:5].copy().
What is the "Ellipsis" (...) in slicing?
Answer: It represents as many colons as needed to select full dimensions. In 3D, arr[..., 0] is the same as arr[:, :, 0].
What does np.newaxis do during slicing?
Answer: It increases the dimension of the slice by one (e.g., turning a 1D slice into a 2D column).
What is "Fancy Indexing"?
Answer: Passing an array of indices to access multiple non-contiguous elements at once.
How do you select elements at (0,0), (1,2), and (2,1) in a 2D array?
Answer: arr[[0, 1, 2], [0, 2, 1]].
Does Fancy Indexing return a view or a copy?
Answer: Unlike basic slicing, Fancy Indexing returns a copy.
How do you combine boolean masking with slicing?
Answer: arr[arr > 0][0:2] (finds positive numbers and takes the first two).
What is the difference between arr[0, 0] and arr[(0, 0)]?
Answer: There is no difference; both access the element at row 0, column 0.
How do you slice an array based on a list of row indices?
Answer: arr[[0, 3, 5], :].
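A sketch contrasting views (basic slices) with copies (.copy() and fancy indexing), plus Ellipsis and np.newaxis. The data is illustrative.

```python
import numpy as np

arr = np.arange(10)

view = arr[0:5]
view[0] = -1                     # a slice is a view: arr[0] changes too

copy = arr[5:].copy()
copy[0] = -5                     # an explicit copy: arr[5] is untouched

fancy = arr[[1, 3]]
fancy[0] = 100                   # fancy indexing returned a copy: arr[1] is untouched

cube = np.arange(24).reshape(2, 3, 4)
same = np.array_equal(cube[..., 0], cube[:, :, 0])   # Ellipsis fills in the colons

column = arr[:3, np.newaxis]     # 1-D slice turned into a (3, 1) column

grid = np.arange(9).reshape(3, 3)
chosen = grid[[0, 1, 2], [0, 2, 1]]   # elements at (0,0), (1,2), (2,1) -> [0 5 7]

print(arr[:6], same, column.shape, chosen)
```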
What is a Pandas DataFrame?
Answer: A 2-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns).
How do you import Pandas?
Answer: import pandas as pd.
What is the difference between a Series and a DataFrame?
Answer: A Series is a 1D array with labels; a DataFrame is a 2D table composed of multiple Series (columns).
How do you create a DataFrame from a Dictionary?
Answer: pd.DataFrame({'Name': ['A', 'B'], 'Age': [20, 25]}).
How do you create a DataFrame from a List of Lists?
Answer: pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B']).
How do you create a DataFrame from a NumPy array?
Answer: pd.DataFrame(np_array, columns=['Col1', 'Col2']).
What happens if you create a DataFrame from a dictionary of unequal list lengths?
Answer: It raises a ValueError because all arrays must be of the same length.
How do you specify custom row indices during creation?
Answer: Use the index parameter: pd.DataFrame(data, index=['R1', 'R2']).
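The DataFrame creation routes above, side by side. The column names follow the answers; the data values are illustrative.

```python
import numpy as np
import pandas as pd

from_dict = pd.DataFrame({'Name': ['A', 'B'], 'Age': [20, 25]})
from_lists = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])
from_array = pd.DataFrame(np.arange(4).reshape(2, 2),
                          columns=['Col1', 'Col2'],
                          index=['R1', 'R2'])      # custom row labels

try:
    pd.DataFrame({'Name': ['A', 'B'], 'Age': [20]})   # unequal lengths
except ValueError as exc:
    length_error = str(exc)

print(from_dict)
print(from_array)
```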
How do you read a CSV file into a DataFrame?
Answer: df = pd.read_csv('filename.csv').
How do you read an Excel file?
Answer: df = pd.read_excel('filename.xlsx').
What does the .head() method do?
Answer: Returns the first 5 rows of the DataFrame (can be customized, e.g., df.head(10)).
What does the .tail() method do?
Answer: Returns the last 5 rows.
How do you check the data types of all columns?
Answer: Using df.dtypes.
How do you get a summary of the DataFrame (index, columns, non-null counts)?
Answer: Using df.info().
What does df.describe() provide?
Answer: Statistical summary (mean, std, min, max, quartiles) for numeric columns.
What is pd.concat()?
Answer: A function used to combine two or more Pandas objects (Series or DataFrames) along a particular axis.
What is the default axis for concat()?
Answer: axis=0 (Vertical concatenation/stacking rows).
How do you concatenate DataFrames horizontally?
Answer: pd.concat([df1, df2], axis=1).
What does the ignore_index=True parameter do?
Answer: It discards the original index and creates a new continuous integer index (0, 1, 2...).
What happens if column names don't match during vertical concatenation?
Answer: Pandas aligns matching columns and fills non-matching ones with NaN (Not a Number).
What is the difference between join='outer' and join='inner' in concat()?
Answer: outer (default) keeps all columns from both; inner only keeps columns common to both.
What is the keys parameter in concat()?
Answer: It creates a MultiIndex, allowing you to identify which part of the data came from which original DataFrame.
Can you concatenate a Series and a DataFrame?
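Answer: Yes. With axis=1 the Series is added as a new column, aligned on the index. With axis=0 it is not added as a row: its values are appended under a separate column, so to append it as a row, first convert it with series.to_frame().T.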
How is concat() different from merge()?
Answer: concat simply stacks data; merge joins data based on common keys (similar to SQL JOIN).
How is concat() different from append()?
Answer: DataFrame.append() was a specific case of concat (axis=0). It was deprecated in Pandas 1.4 and removed in Pandas 2.0, so use pd.concat() instead.
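A sketch of the concat() options discussed above, using two small illustrative frames that share only column A.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})

# Vertical stacking (axis=0 is the default); unmatched columns are filled with NaN.
rows = pd.concat([df1, df2], ignore_index=True)

# join='inner' keeps only the columns common to both frames.
inner = pd.concat([df1, df2], join='inner', ignore_index=True)

# Horizontal concatenation aligns on the row index.
cols = pd.concat([df1, df2], axis=1)

# keys builds a MultiIndex that records where each row came from.
keyed = pd.concat([df1, df2], keys=['first', 'second'])

print(rows)
print(keyed)
```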
How do you add a new column with a default value?
Answer: df['New_Col'] = 0.
How do you add a new column based on an existing column?
Answer: df['Double_Age'] = df['Age'] * 2.
What is the .insert() method?
Answer: It allows you to add a column at a specific index location: df.insert(loc, 'Name', values).
How do you add a column using the .assign() method?
Answer: df = df.assign(New_Col=[1, 2, 3]). (Note: This returns a new DataFrame).
What happens if you assign a list of 5 values to a DataFrame with 10 rows?
Answer: It raises a ValueError because the lengths must match.
How do you add a column at the very beginning of a DataFrame?
Answer: df.insert(0, 'First_Col', values).
Can you add a Series as a new column?
Answer: Yes, Pandas will align the Series index with the DataFrame index.
How do you rename an existing column?
Answer: df.rename(columns={'Old': 'New'}, inplace=True).
How do you delete a column?
Answer: del df['ColumnName'] or df.drop('ColumnName', axis=1, inplace=True).
How do you check if a column exists before adding it?
Answer: if 'Col' in df.columns:.
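The column operations above, applied in sequence to one illustrative frame. The column names Score and City are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Age': [20, 25, 30]})

df['New_Col'] = 0                              # default value for every row
df['Double_Age'] = df['Age'] * 2               # derived from an existing column
df.insert(0, 'First_Col', ['x', 'y', 'z'])     # placed at position 0
df = df.assign(Score=[1, 2, 3])                # returns a new DataFrame

if 'City' not in df.columns:                   # guard before adding
    df['City'] = 'unknown'

df = df.rename(columns={'New_Col': 'Flag'})
df = df.drop('Flag', axis=1)

try:
    df['Bad'] = [1, 2]                         # 2 values for 3 rows
except ValueError:
    length_mismatch = True

print(df)
```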
What is the difference between .loc and .iloc?
Answer: .loc is label-based (uses names); .iloc is integer-position based (uses numbers).
How do you select a single column?
Answer: df['ColumnName'] or df.ColumnName.
How do you select multiple columns?
Answer: df[['Col1', 'Col2']] (Note the double brackets).
What is the shape of a DataFrame?
Answer: df.shape returns a tuple (rows, columns).
How do you find the number of rows?
Answer: len(df) or df.shape[0].
What is inplace=True?
Answer: It modifies the existing DataFrame directly instead of returning a new one.
How do you handle missing values (NaN) in a column?
Answer: Using df.fillna(0) or df.dropna().
How do you change the data type of a column?
Answer: df['Col'] = df['Col'].astype('float').
How do you find unique values in a column?
Answer: df['Col'].unique().
What does df['Col'].value_counts() do?
Answer: Returns the count of each unique value in that column.
How do you sort a DataFrame by a column?
Answer: df.sort_values(by='ColName', ascending=False).
What is "Broadcasting" in Pandas?
Answer: Applying an operation (like + 5) to an entire column at once.
How do you set a specific column as the index?
Answer: df.set_index('ColName', inplace=True).
How do you reset the index?
Answer: df.reset_index(drop=True, inplace=True).
Why is Pandas preferred over standard Python for data analysis?
Answer: It is built on NumPy, providing high performance, easy handling of missing data, and powerful alignment and grouping features.
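Several of the selection and clean-up answers above can be tried on one small frame. The row labels and values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C', 'D'],
                   'Age': [25, None, 30, 25]},
                  index=['r1', 'r2', 'r3', 'r4'])

by_label = df.loc['r3', 'Age']             # label-based      -> 30.0
by_position = df.iloc[0, 0]                # position-based   -> 'A'
two_cols = df[['Name', 'Age']]             # double brackets select several columns

filled = df['Age'].fillna(0).astype(int)   # replace NaN, then change the dtype
counts = df['Age'].value_counts()          # NaN is excluded by default
names = df['Name'].unique()
oldest_first = df.sort_values(by='Age', ascending=False)   # NaN rows go last

print(df.shape, len(df))
print(oldest_first)
```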
What is the most common function to read a text-based file?
Answer: pd.read_csv(). Despite the name, it is used for almost all delimited text files.
What is a "Delimiter" or "Separator"?
Answer: A character (like a comma, tab, or semicolon) that separates individual data values in a file.
How do you specify a file path in Pandas?
Answer: By passing a string to the read function, e.g., pd.read_csv("C:/data/file.csv").
What is the difference between an absolute and relative file path?
Answer: An absolute path starts from the root (C:...); a relative path starts from the current working directory.
What happens if the file you are trying to read doesn't exist?
Answer: Python raises a FileNotFoundError.
How do you read only the first 10 rows of a very large file?
Answer: Use the nrows parameter: pd.read_csv('file.csv', nrows=10).
How do you skip the first 3 rows of a file while reading?
Answer: Use skiprows=3.
How do you specify which row should be the header?
Answer: Use the header parameter (e.g., header=0 for the first row).
What if your file has no header row?
Answer: Set header=None. Pandas will assign integer column names (0, 1, 2...).
How can you provide custom column names while reading?
Answer: Use the names parameter: pd.read_csv('file.csv', names=['Col1', 'Col2']).
What is the default separator for pd.read_csv()?
Answer: A comma (,).
How do you read a Tab-Separated Value (TSV) file?
Answer: pd.read_csv('file.txt', sep='\t').
How do you read a text file where the separator is a semicolon (;)?
Answer: pd.read_csv('file.txt', sep=';').
What does index_col do?
Answer: It allows you to set a specific column from the file as the DataFrame index.
How do you handle files with a different encoding (like Latin-1)?
Answer: Use the encoding parameter: pd.read_csv('file.csv', encoding='latin-1').
How do you read only specific columns from a CSV?
Answer: Use usecols, e.g., pd.read_csv('file.csv', usecols=['Name', 'Age']).
What is na_values?
Answer: A parameter used to define additional strings that should be recognized as NaN (e.g., na_values=['Missing', '??']).
What does low_memory=False do?
Answer: It forces Pandas to process the file in one go instead of chunks, which helps prevent "mixed type" warnings in large files.
How do you read a file that uses a variable number of spaces as a separator?
Answer: Use sep=r'\s+' (a raw string, so that Python does not treat the backslash as an escape).
Can Pandas read a CSV file directly from a URL?
Answer: Yes, just pass the URL string as the file path.
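A sketch of the read_csv() parameters above. To keep it self-contained it reads from in-memory text via io.StringIO instead of a file on disk; the sample data is made up.

```python
import io
import pandas as pd

csv_text = "Name;Age;City\nA;20;X\nB;??;Y\nC;30;Z\n"

# Semicolon separator, '??' treated as missing, only two columns kept.
df = pd.read_csv(io.StringIO(csv_text), sep=';',
                 na_values=['??'], usecols=['Name', 'Age'])

# Read only the first two data rows.
first_two = pd.read_csv(io.StringIO(csv_text), sep=';', nrows=2)

# No header row, variable-width whitespace, custom column names.
spaced = pd.read_csv(io.StringIO("1 2   3\n4  5 6\n"),
                     sep=r'\s+', header=None, names=['a', 'b', 'c'])

print(df)
print(spaced)
```

With a real file, the first argument would simply be the path string, e.g. pd.read_csv('file.csv', sep=';').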
Which function is used to read Excel files?
Answer: pd.read_excel().
What library must be installed to read modern .xlsx files?
Answer: openpyxl.
How do you read a specific sheet by its name?
Answer: pd.read_excel('file.xlsx', sheet_name='Sheet2').
How do you read the second sheet by its index?
Answer: pd.read_excel('file.xlsx', sheet_name=1). (Index starts at 0).
How do you read ALL sheets from an Excel file at once?
Answer: pd.read_excel('file.xlsx', sheet_name=None). This returns a dictionary of DataFrames.
Can pd.read_excel() handle merged cells?
Answer: Yes, but it usually fills the first cell and marks others as NaN.
What if an Excel file has an empty first row?
Answer: Use skiprows=1 to start reading from the second row.
How do you limit the number of columns read from Excel?
Answer: Use the usecols parameter (e.g., "A:C" or [0, 2]).
What is the ExcelFile class in Pandas?
Answer: It is used for performance when reading multiple sheets from the same file to avoid re-opening the file repeatedly.
Does read_excel support passwords?
Answer: Not directly; you usually need a separate tool such as msoffice-crypt to decrypt the file first.
What does JSON stand for?
Answer: JavaScript Object Notation.
Which function reads JSON data?
Answer: pd.read_json().
What are the common "orient" values in read_json?
Answer: 'split', 'records', 'index', 'columns', and 'values'.
What is the 'records' orientation?
Answer: Data formatted as a list of dictionaries (e.g., [{col1: val1}, {col1: val2}]).
What is "Flattening" in the context of JSON?
Answer: Converting nested JSON (dictionaries inside dictionaries) into a flat table.
Which function is used for nested JSON?
Answer: pd.json_normalize().
Can you read JSON from a web API?
Answer: Yes, by passing the API URL to read_json().
What is the difference between a JSON object and a JSON array?
Answer: An object is a dictionary {}; an array is a list [].
How does Pandas handle missing keys in a JSON file?
Answer: It fills the missing values with NaN.
What is a "Line-delimited JSON" (JSONL)?
Answer: A file where each line is a separate JSON object. Read it with lines=True.
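A sketch of the JSON readers above, again using in-memory text via io.StringIO so it runs without any files. The sample records are made up.

```python
import io
import pandas as pd

# 'records' orientation: a list of dictionaries; the second record lacks "age".
records = '[{"name": "A", "age": 20}, {"name": "B"}]'
df = pd.read_json(io.StringIO(records), orient='records')

# Line-delimited JSON: one object per line.
jsonl = '{"name": "A", "age": 20}\n{"name": "B", "age": 25}\n'
df_lines = pd.read_json(io.StringIO(jsonl), lines=True)

# Nested JSON flattened into dotted column names.
nested = [{"name": "A", "address": {"city": "X", "zip": "1"}}]
flat = pd.json_normalize(nested)

print(df)
print(flat.columns.tolist())
```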
How do you check if your DataFrame was loaded correctly?
Answer: Use df.info() or df.head().
What does parse_dates do?
Answer: It tells the reader which columns to convert from date-like strings into datetime values (dtype datetime64), e.g., parse_dates=['Date'].
How do you handle a file with a footer (rows at the end to ignore)?
Answer: Use skipfooter=N and engine='python'.
What is "Chunking"?
Answer: Reading a massive file in smaller pieces using the chunksize parameter to save memory.
What is the dtype parameter in read functions?
Answer: It allows you to force a column to be a certain type (e.g., dtype={'ID': str}).
What is the difference between read_csv and to_csv?
Answer: read_csv is for input (loading); to_csv is for output (saving).
Can Pandas read compressed files (like .zip or .gz)?
Answer: Yes, it automatically detects compression or you can specify compression='zip'.
What is the thousands parameter?
Answer: It handles numbers formatted with commas (e.g., "1,000") so they are read as integers.
How do you read a file with a comment character (like #)?
Answer: Use comment='#'. Pandas will ignore anything after the symbol.
Why is pd.read_parquet() often faster than pd.read_csv()?
Answer: Parquet is a binary, columnar format that is compressed and stores column types, so Pandas can skip text parsing and read only the columns it needs.
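A sketch of chunksize, dtype, and parse_dates on in-memory text. The column names id, value, day, and sales are hypothetical.

```python
import io
import pandas as pd

csv_text = "id,value\n" + "".join(f"{i},{i * 10}\n" for i in range(10))

# Chunking: process 4 rows at a time instead of loading everything at once.
total = 0
chunk_sizes = []
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4, dtype={'id': str}):
    chunk_sizes.append(len(chunk))
    total += int(chunk['value'].sum())

# parse_dates: convert the named column to datetime64 while reading.
dated = pd.read_csv(io.StringIO("day,sales\n2024-01-01,5\n2024-01-02,7\n"),
                    parse_dates=['day'])

print(chunk_sizes, total)
print(dated.dtypes)
```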
What is Matplotlib?
Answer: A comprehensive library for creating static, animated, and interactive visualizations in Python.
How do you import the plotting module?
Answer: import matplotlib.pyplot as plt.
What is the difference between a Figure and an Axes?
Answer: The Figure is the entire window/page where everything is drawn. The Axes is the specific area (subplot) where the data is plotted (the "plot" itself).
How do you add a title to a plot?
Answer: plt.title("Your Title").
How do you label the X and Y axes?
Answer: plt.xlabel("X-axis Name") and plt.ylabel("Y-axis Name").
What does plt.show() do?
Answer: It displays the generated figure on the screen.
How do you change the size of a plot?
Answer: plt.figure(figsize=(width, height)).
How do you save a plot as an image file?
Answer: plt.savefig("filename.png").
What is a Legend and how do you add it?
Answer: A legend identifies different data series. Use plt.legend() after giving plots a label argument.
How do you add a grid to the background?
Answer: plt.grid(True).
When should you use a Bar Graph?
Answer: To compare categorical data or show changes over time for discrete categories.
Which function creates a vertical bar chart?
Answer: plt.bar(x, height).
How do you create a horizontal bar chart?
Answer: plt.barh(y, width).
How do you change the color of the bars?
Answer: Using the color parameter: plt.bar(x, y, color='red').
What is a Pie Chart used for?
Answer: To show proportions or percentages of a whole.
Which function creates a Pie Chart?
Answer: plt.pie(data).
What does the autopct parameter do in a Pie Chart?
Answer: It displays the percentage value on each wedge (e.g., autopct='%1.1f%%').
How do you "pop out" a slice in a Pie Chart?
Answer: Use the explode parameter with a list of values.
How do you add labels to wedges in a Pie Chart?
Answer: Use the labels parameter: plt.pie(data, labels=my_labels).
How do you make a Pie Chart a perfect circle?
Answer: plt.axis('equal').
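A sketch of a bar graph and a pie chart drawn side by side. The labels and values are made up, the Agg backend is selected only so the code runs without a display, and the figure is saved to an in-memory buffer rather than a file.

```python
import io
import matplotlib
matplotlib.use("Agg")            # headless backend; omit this in an interactive session
import matplotlib.pyplot as plt

labels = ["A", "B", "C"]         # hypothetical categories
values = [5, 3, 2]

plt.figure(figsize=(8, 4))

plt.subplot(1, 2, 1)
bars = plt.bar(labels, values, color="red")
plt.title("Bar Graph")
plt.xlabel("Category")
plt.ylabel("Count")

plt.subplot(1, 2, 2)
plt.pie(values, labels=labels, autopct="%1.1f%%", explode=[0.1, 0, 0])
plt.axis("equal")                # keep the pie circular
plt.title("Pie Chart")
pie_texts = [t.get_text() for t in plt.gca().texts]   # wedge labels and percentages

plt.tight_layout()
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
print(pie_texts)
```

In a normal script, plt.savefig("filename.png") or plt.show() would replace the buffer.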
What is a Box Plot (Box-and-Whisker)?
Answer: A plot that shows the distribution of data based on a five-number summary: Minimum, First Quartile (Q1), Median, Third Quartile (Q3), and Maximum.
What does the line inside the "box" represent?
Answer: The Median of the dataset.
What are "Outliers" in a Box Plot?
Answer: Data points that fall beyond the "whiskers", which by default extend to the most extreme data points within $1.5 \times$ the Interquartile Range (IQR) of the box edges.
Which function creates a Box Plot?
Answer: plt.boxplot(data).
What is a Histogram?
Answer: A plot that represents the frequency distribution of a continuous numerical variable.
How is a Histogram different from a Bar Chart?
Answer: Histograms plot quantitative data (ranges/bins), while Bar Charts plot categorical data.
What are "Bins" in a Histogram?
Answer: Bins are the intervals into which the entire range of data is divided.
How do you change the number of bins?
Answer: plt.hist(data, bins=20).
What does density=True do in plt.hist()?
Answer: It normalizes the histogram so that the total area under the bars equals 1 (Probability Density).
How do you create a cumulative histogram?
Answer: plt.hist(data, cumulative=True).
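A sketch linking the box plot's five-number summary and the $1.5 \times$ IQR outlier rule to a density histogram. The data set is made up so that one value (50) is an obvious outlier; the Agg backend is used only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend; omit this in an interactive session
import matplotlib.pyplot as plt
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])

# Quartiles and the 1.5 x IQR fences that define outliers.
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < lower) | (data > upper)]

plt.figure(figsize=(8, 4))

plt.subplot(1, 2, 1)
plt.boxplot(data)
plt.title("Box Plot")

plt.subplot(1, 2, 2)
counts, edges, _ = plt.hist(data, bins=5, density=True)
plt.title("Histogram")

# With density=True the bar areas (height x bin width) add up to 1.
area = float(np.sum(counts * np.diff(edges)))
print(median, outliers, area)
```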
When is a Line Chart most useful?
Answer: For showing trends over time (time-series data).
Which function creates a Line Chart?
Answer: plt.plot(x, y).
How do you change the line style (dashed, dotted)?
Answer: Use the linestyle parameter (e.g., linestyle='--', ':', or '-.').
How do you add markers (dots) to the data points on a line?
Answer: plt.plot(x, y, marker='o').
How do you plot multiple lines in one graph?
Answer: Call plt.plot() multiple times before calling plt.show().
What is a Subplot?
Answer: A way to place multiple plots in a single figure in a grid-like layout.
What is the syntax for plt.subplot()?
Answer: plt.subplot(nrows, ncols, index).
What does plt.subplot(2, 2, 1) mean?
Answer: A $2 \times 2$ grid of plots, and we are selecting the 1st plot (top-left).
What is plt.subplots() (plural)?
Answer: A more powerful function that returns both a Figure and an array of Axes objects.
How do you add a title to the whole figure (not just a subplot)?
Answer: plt.suptitle("Main Title").
What is the tight_layout() function?
Answer: It automatically adjusts subplot parameters so that labels and titles don't overlap.
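A sketch of line charts arranged with plt.subplot(), including line styles, markers, a legend, and a figure-level title. The data is made up, and the Agg backend is used only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend; omit this in an interactive session
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]

plt.figure(figsize=(8, 3))

# Left plot: two lines on the same axes.
plt.subplot(1, 2, 1)
plt.plot(x, [1, 4, 9, 16], linestyle="--", marker="o", label="squares")
plt.plot(x, [1, 8, 27, 64], linestyle=":", label="cubes")
plt.legend()
plt.grid(True)
plt.title("Two lines")

# Right plot: a single line.
plt.subplot(1, 2, 2)
plt.plot(x, [2, 4, 6, 8])
plt.title("One line")

plt.suptitle("Main Title")       # title for the whole figure
plt.tight_layout()

fig = plt.gcf()
print(len(fig.axes), len(fig.axes[0].lines))
```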
How do you change the color map (cmap) in Matplotlib?
Answer: Using the cmap parameter (e.g., cmap='viridis').
What is the "Object-Oriented" style of plotting?
Answer: Creating objects explicitly: fig, ax = plt.subplots() and then using ax.plot().
How do you fill the area under a line chart?
Answer: plt.fill_between(x, y).
How do you create a Scatter Plot?
Answer: plt.scatter(x, y).
How do you add text at a specific coordinate on the plot?
Answer: plt.text(x, y, "Message").
What is a "Tick"?
Answer: The markers on the axes indicating specific values.
How do you rotate X-axis labels (useful for long names)?
Answer: plt.xticks(rotation=45).
What is the alpha parameter?
Answer: It controls the transparency of the plot (0 is transparent, 1 is opaque).
How do you clear a plot to start a new one?
Answer: plt.clf() (Clear Figure) or plt.cla() (Clear Axes).