Author: June Hong

  • <cstring> vs <string> in C++

    C-Style Strings and the STL's std::string

    In C++, a string is a sequence of characters used to represent text. C++ provides two primary ways to handle strings:

    Key notes

    • 1. C-Style Strings: Character arrays (from the C language).
    • 2. std::string: A more modern and flexible string class provided by the Standard Template Library (STL).

    1. C-Style Strings

    A C-style string is a null-terminated character array.

    Example

    #include <iostream>
    #include <cstring> // For string functions like strlen, strcpy
    
    int main() {
        char str1[] = "Hello";          // String literal
        char str2[20];                  // Array for storing strings
    
        std::cout << "Length of str1: " << strlen(str1) << std::endl; // Length of the string
        strcpy(str2, str1);                           // Copy str1 into str2
        strcat(str2, ", World!");                     // Concatenate strings
    
        std::cout << "str2: " << str2 << std::endl; // Output: Hello, World!
    
        return 0;
    }

    Common Functions in <cstring>

    • strlen(char*): Get the length of a string.

    • strcpy(dest, src): Copy one string to another.

    • strcat(dest, src): Concatenate strings.

    • strcmp(str1, str2): Compare two strings.

    Limitations

    • Fixed size: Requires pre-defining the array size.

    • Manual memory management: Risk of buffer overflows.

    2. std::string (Preferred in Modern C++)

    The std::string class is part of the C++ Standard Library and provides a safer, easier, and more flexible way to work with strings.

    Key Features

    • Dynamic sizing.

    • Supports many useful operations as member functions.

    • Automatically manages memory.

    Example

    #include <iostream>
    #include <string> // Required for std::string
    
    int main() {
        std::string str1 = "Hello";          // Initialize with a literal
        std::string str2 = "World";
    
        // Concatenate strings using +
        std::string combined = str1 + ", " + str2 + "!";
    
        // Length of the string
        std::cout << "Length: " << combined.length() << std::endl;
    
        // Access individual characters
        std::cout << "First character: " << combined[0] << std::endl;
    
        // Substring
        std::cout << "Substring: " << combined.substr(7, 5) << std::endl; // Output: World
    
        // Find a substring
        size_t pos = combined.find("World");
    
        if (pos != std::string::npos) {
            std::cout << "'World' found at position: " << pos << std::endl;
        }
    
        // Replace part of the string
        combined.replace(7, 5, "Universe");
        std::cout << "Replaced string: " << combined << std::endl;
    
        return 0;
    }

    Key Functions of std::string

    Length and Capacity:

    • .length() or .size(): Returns the number of characters.

    • .capacity(): Returns the total allocated capacity.

    Modifying Strings:

    • .append(): Appends another string.

    • .insert(pos, str): Inserts a string at a position.

    • .erase(pos, len): Erases characters from a position.

    • .replace(pos, len, str): Replaces part of the string.

    Substring and Search:

    • .substr(pos, len): Extracts a substring.

    • .find(str): Finds the first occurrence of a substring.

    • .rfind(str): Finds the last occurrence of a substring.

    Comparison:

    • ==, !=, <, >: Direct comparison operators for strings.

    Access:

    • str[index]: Access a character at a specific position.

    3. Converting Between C-Style Strings and std::string

    From C-Style to std::string:

    char cstr[] = "Hello, World!";
    std::string str = cstr; // Automatic conversion

    From std::string to C-Style:

    std::string str = "Hello, World!";
    const char* cstr = str.c_str(); // Returns a C-style null-terminated string

    Comparison Between C-Style Strings and std::string

    Feature           | C-Style Strings          | std::string
    Memory Management | Manual                   | Automatic
    Dynamic Size      | No                       | Yes
    Ease of Use       | Low                      | High
    Performance       | Faster for small strings | Slight overhead for safety
    Functionality     | Limited                  | Extensive

    Best Practices

    1. Prefer std::string over C-style strings for safety and flexibility.

    2. Use C-style strings only when required for interoperability with C libraries or performance-critical scenarios.

    3. Avoid direct pointer manipulation unless absolutely necessary.

  • What happens when you type a URL into your browser?

    When you type a URL into your browser and press “Enter,” a complex sequence of actions occurs to fetch and display the requested web page. Here’s a detailed step-by-step explanation:

    1. URL Parsing

    The browser interprets the URL to understand:

    Protocol: e.g., HTTP or HTTPS.

    Domain Name: e.g., example.com.

    Path: e.g., /about.

    Query Parameters: e.g., ?key=value.

    2. DNS Resolution

    • The browser translates the domain name (e.g., example.com) into an IP address.

    • This is done by querying the DNS (Domain Name System):

    1) Check local DNS cache (browser, operating system).

    2) If not found, query the configured DNS server.

    3) The DNS server resolves the IP address and returns it to the browser.

    Example: example.com → 93.184.216.34
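    The resolution step can be sketched with Python's standard socket module, which asks the same OS resolver a browser ultimately relies on (localhost is used here so the sketch works without network access):

```python
import socket

# Map a hostname to an IPv4 address via the OS resolver.
# Browsers consult their own cache first, then this same machinery.
ip = socket.gethostbyname("localhost")
print(ip)  # loopback address, e.g. 127.0.0.1
```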

    3. Establish a TCP Connection

    The browser establishes a TCP (Transmission Control Protocol) connection with the web server at the resolved IP address:

    1) Three-Way Handshake:

    • The browser sends a SYN packet to the server.

    • The server responds with a SYN-ACK packet.

    • The browser replies with an ACK packet, completing the handshake.
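    The handshake itself is handled by the operating system inside connect(). A toy sketch with a loopback listener standing in for the remote web server:

```python
import socket
import threading

# A loopback listener stands in for the web server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def accept_one():
    conn, _ = server.accept()
    conn.close()

t = threading.Thread(target=accept_one)
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))  # SYN / SYN-ACK / ACK completes here
client.close()
t.join()
server.close()
```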

    4. Secure the Connection (if HTTPS)

    • For HTTPS, the browser initiates an SSL/TLS handshake to secure the connection:

    1) The server presents its SSL certificate.

    2) The browser verifies the certificate (ensuring the connection is secure and trustworthy).

    3) Encryption keys are exchanged, and the connection becomes secure.

    5. Sending the HTTP Request

    The browser sends an HTTP or HTTPS request to the server. The request includes:

    Method: e.g., GET (fetch resource), POST (submit data).

    Headers: Metadata about the request (e.g., browser type, cookies).

    Payload (if applicable): Data sent with requests like POST.

    Example Request:

    GET /about HTTP/1.1
    Host: example.com
    User-Agent: Mozilla/5.0

    6. Server Processing

    • The server receives the request and processes it:

    1) Match the request to a resource (e.g., HTML file, API endpoint).

    2) Run backend logic, if necessary (e.g., database queries, authentication).

    3) Generate an HTTP response with:

    Status Code: e.g., 200 (OK), 404 (Not Found), 500 (Server Error).

    Headers: Metadata (e.g., content type, cache rules).

    Body: The actual content (HTML, JSON, image, etc.).

    Example Response:

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 3421

    7. Browser Receives the Response

    The browser receives the server’s response and begins processing:

    • If the response is compressed (e.g., gzip), the browser decompresses it.

    8. Rendering the Web Page

    1) HTML Parsing:

    • The browser parses the HTML and constructs the DOM (Document Object Model) tree.

    • It identifies additional resources (e.g., CSS, JavaScript, images) and requests them.

    2) CSS Parsing:

    • The browser parses CSS to style the DOM elements.

    3) JavaScript Execution:

    • JavaScript files are downloaded and executed to add interactivity or modify the DOM dynamically.

    4) Layout and Painting:

    • The browser calculates the layout of elements and renders them onto the screen.

    9. Additional Requests

    • As specified in the HTML, the browser sends additional requests for resources like images, CSS, and JavaScript.

    • These resources are fetched, cached, and rendered.

    10. Caching

    • The browser may store static resources (e.g., CSS, images) locally based on caching rules from the server.

    • This speeds up future visits to the same site.

    11. User Interaction

    • Once the page is loaded, the browser remains ready to handle user interactions, like clicks and form submissions.

    Error Handling

    If something goes wrong:

    DNS Failure: If the domain cannot be resolved, a “DNS not found” error is displayed.

    Server Down: If the server does not respond, a “Connection Timeout” error occurs.

    404 Error: If the requested resource does not exist, the server returns a “404 Not Found” response.

    This process involves multiple layers of protocols (DNS, TCP/IP, HTTP/HTTPS) and components (browser, server, DNS resolver) working seamlessly to deliver a web page in seconds.

  • Advantages of Handwriting Over Typing When Learning Programming

    When it comes to learning programming or software development, handwriting has several advantages over typing, particularly in fostering deeper understanding and long-term retention of complex concepts. Here are the key benefits:

    1. Better Conceptual Understanding

    Engages Cognitive Processing: Writing code by hand forces you to slow down and think critically about every line of code, syntax, and logic, as there’s no auto-correct or code completion to assist you. This process reinforces your understanding of how the code works.

    Encourages Syntax Memorization: Handwriting code requires you to internalize the syntax and logic of a programming language, which is critical for problem-solving in real-world scenarios.

    2. Enhances Retention

    Stronger Memory Encoding: The physical act of handwriting improves memory retention compared to typing. This is particularly helpful when learning new programming concepts, algorithms, or data structures.

    Repetition Reinforces Learning: Writing out code repeatedly helps solidify key programming concepts and patterns in your mind.

    3. Focus and Attention

    Minimizes Distractions: Unlike typing on a computer where you’re exposed to distractions (notifications, multitasking, or internet), handwriting keeps you focused on the learning material.

    Deep Work: Handwriting allows you to focus on the logic and flow of a program without relying on tools like IDEs or debuggers that might encourage trial-and-error coding.

    4. Improves Problem-Solving Skills

    Algorithm Planning: Writing algorithms, pseudocode, or flowcharts on paper helps you visualize the problem better and develop logical solutions before jumping into implementation.

    Debugging Mindset: Handwriting code teaches you to double-check for errors before “running” it, simulating the real-world need to debug effectively.

    5. Encourages a Deeper Connection to the Code

    No Auto-Completion or Syntax Highlighting: Without assistance from an IDE, you develop a deeper understanding of the programming language, learning its nuances and intricacies.

    Foundational Learning: Handwriting code builds a strong foundation in programming, which helps you transition to advanced development later.

    6. Better for Learning Algorithms and Data Structures

    • Handwriting diagrams for data structures like linked lists, trees, and graphs, along with their algorithms, helps you visualize relationships and logic better than typing.

    • Example: Drawing out a binary search tree and writing the recursive function manually ensures you understand the concept deeply before typing it out.

    7. Improves Exam and Interview Preparation

    Coding Interviews: Many programming job interviews require you to write code on a whiteboard or on paper. Practicing handwriting improves your ability to write clear, concise, and error-free code in such scenarios.

    Exams: If you’re studying programming in an academic setting, exams often require handwritten answers for theoretical and practical coding questions.

    8. Boosts Creativity

    Brainstorming Ideas: Writing by hand lets you freely draw diagrams, arrows, or notes alongside your code, which enhances creativity and problem-solving.

    Flowcharts and Sketches: Creating flowcharts or pseudocode on paper helps organize your thought process better than directly typing.

    When Should You Handwrite While Learning Programming?

    1) Understanding New Concepts: Write out examples and notes by hand to retain syntax and logic.

    2) Algorithm Design: Sketch algorithms, flowcharts, and pseudocode to visualize the problem.

    3) Interview Preparation: Practice solving problems on paper or a whiteboard.

    4) Studying Data Structures: Draw and analyze structures like arrays, linked lists, and graphs.

    Handwriting and Typing: A Balanced Approach

    While handwriting is excellent for learning and understanding foundational concepts, typing has its own advantages when implementing and testing code. A balanced approach works best:

    Start with Handwriting: Write algorithms, pseudocode, and small code snippets to understand concepts.

    Transition to Typing: Once you’re comfortable, type the code into an IDE to test and debug.

    Conclusion

    Handwriting is invaluable for learning programming because it:

    • Reinforces memory and understanding.

    • Builds strong foundational skills.

    • Prepares you for coding interviews and exams.

    However, once you’ve mastered the basics, typing becomes essential for real-world implementation, testing, and development. Combining both methods will give you the best results in your programming journey.

  • What Is Back-of-the-Envelope Estimation?

    Back-of-the-envelope estimation is a rough calculation used to estimate the feasibility, cost, or scale of a project or problem without delving into highly detailed analysis or precise data. It’s a quick and practical way to get an approximate answer based on reasonable assumptions.

    This technique is often used in system design, engineering, product planning, and brainstorming sessions to:

    • Determine whether an idea is worth pursuing.

    • Spot potential bottlenecks or challenges.

    • Validate the feasibility of solutions or architectures.

    Steps for a Back-of-the-Envelope Estimation:

    1. Define the Problem Clearly:

    • Understand what you’re estimating (e.g., storage requirements, bandwidth, number of servers).

    • Break the problem into smaller, manageable components.

    2. Make Assumptions:

    • Use approximate values based on industry standards or prior knowledge.

    • Err on the side of simplicity and round numbers for quick calculations.

    3. Perform Calculations:

    • Use basic arithmetic to combine assumptions and get the result.

    4. Validate Assumptions:

    • Ensure assumptions are reasonable.

    • Adjust estimates if the results seem unrealistic.

    5. State Assumptions Explicitly:

    • Communicate the assumptions clearly so others can understand and refine your calculations.

    Example 1: Estimating Storage for a Chat Application

    Problem: Estimate the annual storage requirements for a messaging app with 1 million daily active users (DAUs).

    Assumptions:

    1. Each user sends 50 messages per day.

    2. Average message size is 200 bytes.

    3. Messages are stored for 1 year.

    Calculation:

    1. Total messages per day = 1,000,000 users × 50 messages = 50,000,000 messages/day.

    2. Total storage per day = 50,000,000 messages × 200 bytes = 10 GB/day.

    3. Annual storage = 10 GB/day × 365 days = 3.65 TB/year.

    Result: You would need approximately 3.65 TB of storage per year.
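    The arithmetic above is easy to check in a few lines (the numbers mirror the stated assumptions):

```python
dau = 1_000_000        # daily active users
msgs_per_user = 50     # messages per user per day
msg_size = 200         # bytes per message

msgs_per_day = dau * msgs_per_user          # 50,000,000 messages/day
bytes_per_day = msgs_per_day * msg_size     # 10,000,000,000 bytes
gb_per_day = bytes_per_day / 1e9            # 10 GB/day
tb_per_year = gb_per_day * 365 / 1000       # 3.65 TB/year
print(gb_per_day, tb_per_year)
```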

    Example 2: Bandwidth for a Video Streaming Service

    Problem: Estimate the bandwidth required to stream 1080p videos to 10,000 concurrent users.

    Assumptions:

    1. 1080p video uses 5 Mbps.

    2. All users are watching simultaneously.

    Calculation:

    1. Bandwidth per user = 5 Mbps.

    2. Total bandwidth = 5 Mbps × 10,000 users = 50,000 Mbps = 50 Gbps.

    Result: You need a network capable of handling 50 Gbps of traffic.
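    The same sanity check for the bandwidth estimate:

```python
mbps_per_user = 5      # assumed 1080p bitrate
users = 10_000         # concurrent viewers

total_mbps = mbps_per_user * users   # 50,000 Mbps
total_gbps = total_mbps / 1000       # 50 Gbps
print(total_gbps)
```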

    Common Use Cases:

    1. System Design:

    • Estimating server requirements (CPU, memory, disk).

    • Predicting data throughput or storage needs.

    2. Capacity Planning:

    • Assessing whether existing infrastructure can handle growth.

    3. Financial Projections:

    • Quick revenue or cost estimations for business models.

    4. Engineering Problems:

    • Estimating power consumption, materials, or load capacities.

    Tips for Effective Back-of-the-Envelope Estimations:

    1. Use Reasonable Numbers:

    • Simplify complex calculations by rounding numbers (e.g., use 1M instead of 1,048,576).

    2. Focus on Key Factors:

    • Ignore minor details or edge cases unless they significantly impact the result.

    3. Cross-Check Results:

    • Compare your estimate to known benchmarks or real-world data.

    4. Iterate:

    • Refine your assumptions and calculations if more precise data becomes available.

    Back-of-the-envelope calculations are invaluable in system design, initial project discussions, and rapid brainstorming.

  • Securely Storing Passwords in System – Password with Salt

    Using a password with salt is a common best practice for securely storing passwords in a system. It protects against rainbow table attacks by introducing a random value (the salt) to each password before it is hashed.

    What Is Salt?

    • A salt is a unique, random value that is added to a password before it is hashed.

    • It ensures that even if two users have the same password, their hashed values will be different because the salt is unique for each user.

    Steps for Storing Passwords with Salt:

    1. Generate a Salt:

    • Use a cryptographically secure random generator to create a unique salt for each user.

    2. Combine Salt and Password:

    • Append or prepend the salt to the user’s password.

    3. Hash the Combined Value:

    • Use a secure hashing algorithm (e.g., SHA-256, bcrypt, Argon2).

    4. Store the Salt and Hash:

    • Save both the salt and the resulting hash in your database.

    • Ensure the salt is stored in plaintext, as it is not secret.

    Example Implementation in Python

    Here’s an example using Python with the bcrypt library:

    Code Example:

    import bcrypt
    import hmac

    # Function to hash a password with salt
    def hash_password(password: str) -> tuple:
        # Generate a cryptographically secure salt
        salt = bcrypt.gensalt()
        # Hash the password with the salt
        hashed_password = bcrypt.hashpw(password.encode('utf-8'), salt)
        return salt, hashed_password

    # Function to verify a password
    def verify_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
        # Recompute the hash using the provided password and salt
        computed_hash = bcrypt.hashpw(password.encode('utf-8'), salt)
        # Compare in constant time to avoid timing attacks
        return hmac.compare_digest(computed_hash, stored_hash)

    # Example usage
    if __name__ == "__main__":
        user_password = "SecurePassword123!"

        # Hash the password
        salt, stored_hash = hash_password(user_password)
        print(f"Salt: {salt}")
        print(f"Stored Hash: {stored_hash}")

        # Verify the password
        is_valid = verify_password("SecurePassword123!", salt, stored_hash)
        print("Password is valid:", is_valid)

    Example Database Storage:

    Store the salt and hash separately in your database:

    User ID | Salt              | Hash
    1       | random_salt_value | hashed_password

    Best Practices:

    1. Use a Secure Salt Generator:

    • Always use cryptographically secure random number generators for salts.

    • In Python, os.urandom() or libraries like bcrypt are recommended.

    2. Avoid Weak Hashing Algorithms:

    • Do not use MD5 or SHA-1, as they are vulnerable to attacks.

    • Use algorithms designed for passwords, such as:

    bcrypt: Adds internal salt and is computationally expensive.

    Argon2: Modern and highly secure (recommended).

    PBKDF2: Widely used and secure.

    3. Don’t Reuse Salts:

    • Always generate a unique salt for each password.

    4. Limit Login Attempts:

    • Use rate-limiting or captchas to protect against brute-force attacks.
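    The bcrypt example above requires a third-party package. The same salt-then-hash pattern can be sketched with the standard library's hashlib.pbkdf2_hmac (the iteration count here is an illustrative value; tune it so hashing stays deliberately slow on your hardware):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple:
    salt = os.urandom(16)  # cryptographically secure random salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return hmac.compare_digest(candidate, stored)  # constant-time comparison
```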

    Why Is Salting Important?

    Without a salt:

    • Identical passwords will have identical hashes.

    • Attackers can use precomputed hashes from rainbow tables to crack passwords.

    With a salt:

    • Each password gets a unique hash, making it infeasible for rainbow tables to be effective.

  • What is Kafka? Why is Kafka fast?

    What is Kafka?

    Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, and real-time data processing. It is commonly used to build real-time data pipelines and streaming applications. Kafka was originally developed at LinkedIn and later open-sourced as part of the Apache Software Foundation.

    Kafka serves three primary purposes:

    1. Message Broker: Decouples producers (senders) and consumers (receivers) by acting as an intermediary.

    2. Stream Processor: Provides tools to process streams of data in real-time.

    3. Storage System: Durable storage of message streams for later retrieval and processing.

    Core Concepts of Kafka

    1. Producers:

    • Applications or services that write data (messages) to Kafka topics.

    2. Consumers:

    • Applications or services that read data from Kafka topics.

    3. Topics:

    • Categories or logical channels to which producers send messages and consumers subscribe.

    4. Partitions:

    • A topic is divided into one or more partitions for scalability and parallelism. Each partition stores an ordered log of messages.

    5. Brokers:

    • Kafka servers that store topic data. A Kafka cluster consists of multiple brokers.

    6. Replication:

    • Kafka ensures fault tolerance by replicating partitions across multiple brokers.

    7. Zookeeper (or KRaft in newer versions):

    • Used to manage Kafka cluster metadata, leader elections, and configurations (KRaft is Kafka’s native replacement for Zookeeper).

    Why is Kafka Fast?

    Kafka’s exceptional performance stems from its architectural design and optimized use of resources. Here are the key reasons:

    1. Sequential I/O and Batching

    • Kafka writes and reads messages sequentially to/from disk using append-only logs.

    • Sequential writes are much faster than random writes due to reduced disk seek overhead.

    • Messages are batched before being written to disk or sent over the network, further reducing I/O operations.

    2. Zero-Copy Mechanism

    • Kafka uses the sendfile() system call to transfer data directly from disk to network sockets without copying it into application memory.

    • This reduces context switches between user space and kernel space, leading to lower latency and higher throughput.

    3. Efficient Data Storage

    • Kafka does not track individual message acknowledgments; instead, it uses offset-based tracking.

    • Each message in a partition has a unique offset, and consumers maintain their own offsets.

    • This simplifies the storage model, reduces overhead, and increases throughput.
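    A toy in-memory sketch of this idea (not Kafka's actual API): the broker keeps an append-only log per partition, and each consumer remembers only its own offset instead of the broker tracking per-message acknowledgments:

```python
class Partition:
    """Append-only log; messages are addressed by offset and never deleted on read."""
    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1      # offset of the new message

    def read_from(self, offset):
        return self.log[offset:]

partition = Partition()
for msg in ["a", "b", "c"]:
    partition.append(msg)

consumer_offset = 0                   # each consumer tracks its own position
batch = partition.read_from(consumer_offset)
consumer_offset += len(batch)         # "commit" the offset after processing
```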

    4. Partitioning and Parallelism

    • Topics are split into partitions, which allow parallel processing.

    • Each partition can be independently processed by different consumers in a consumer group, making Kafka highly scalable.

    5. Replication with Leader-Follower Model

    • Kafka partitions have a leader that handles all reads and writes, while replicas act as backups.

    • The leader handles client requests, ensuring consistent and efficient operations.

    6. Asynchronous Processing

    • Producers and consumers communicate asynchronously, reducing blocking and latency.

    • Producers do not wait for consumers to process messages, and vice versa.

    7. Backpressure Handling

    • Kafka can handle a large number of messages without overwhelming the system by allowing consumers to process messages at their own pace.

    • Messages remain in Kafka until consumers explicitly commit that they’ve been processed.

    8. High Compression

    • Kafka supports message compression (e.g., GZIP, Snappy, LZ4) at the producer level.

    • Compressed messages reduce storage and network bandwidth usage.

    9. Minimized Coordination Overhead

    • Kafka reduces inter-node coordination by storing and replicating data at the partition level rather than managing individual messages.

    • Consumers maintain their own offsets, eliminating the need for Kafka to track message delivery statuses.

    Kafka vs Traditional Message Brokers

    Feature             | Kafka                                         | Traditional Brokers (e.g., RabbitMQ)
    Data Persistence    | Stores messages persistently by default       | Often optimized for in-memory storage
    Scalability         | Horizontally scalable with partitions         | Limited scalability
    Message Consumption | Pull-based (consumers fetch messages)         | Push-based (brokers send messages)
    Performance         | Optimized for high throughput and low latency | Generally slower
    Fault Tolerance     | Built-in replication                          | Depends on implementation

    Typical Use Cases for Kafka

    1. Log Aggregation:

    • Collect and centralize application logs for monitoring or analysis.

    2. Real-Time Data Pipelines:

    • Stream data between microservices or systems (e.g., ETL pipelines).

    3. Stream Processing:

    • Process data in real-time using Kafka Streams or tools like Apache Flink.

    4. Event Sourcing:

    • Store a record of every change to an application state.

    5. IoT and Sensor Data:

    • Collect and analyze data from IoT devices.

  • Git – Distributed Version Control System

    Git is a distributed version control system that helps developers track changes in source code, collaborate with others, and manage projects efficiently. It was created by Linus Torvalds in 2005 to support the development of the Linux kernel and has since become one of the most widely used tools in software development.

    Key Features of Git:

    1. Version Control: Tracks changes to files over time, enabling you to restore previous versions or compare changes.

    2. Distributed: Every developer has a full copy of the entire repository (history and all), which makes it faster and allows for offline work.

    3. Branching and Merging: Developers can create branches for new features or bug fixes, work on them independently, and merge them back into the main branch.

    4. Collaboration: Allows multiple developers to work on the same project without overwriting each other’s changes.

    5. Efficiency: Optimized for speed and performance with low overhead.

    Core Concepts in Git:

    1. Repository (Repo):

    • A directory containing your project files and the entire history of changes.

    • Initialized with git init.

    2. Commit:

    • A snapshot of changes in your code.

    • Created using git commit.

    3. Branch:

    • A parallel line of development.

    • The default branch is usually main or master.

    4. Merge:

    • Combines changes from one branch into another, often from a feature branch into the main branch.

    5. Pull and Push:

    Pull: Updates your local repository with changes from a remote repository.

    Push: Uploads your changes from the local repository to a remote repository.

    6. Remote:

    • A version of your repository stored on a server (e.g., GitHub, GitLab, Bitbucket).

    • Added with git remote add.

    Basic Git Commands:

    Command               | Description
    git init              | Initialize a new Git repository.
    git clone <url>       | Clone an existing remote repository.
    git status            | Show the status of changes in the working directory.
    git add <file>        | Stage changes for the next commit.
    git commit -m "<msg>" | Commit the staged changes with a message.
    git branch            | List branches or create a new branch.
    git checkout <branch> | Switch to a different branch.
    git merge <branch>    | Merge a branch into the current branch.
    git pull              | Fetch and merge changes from a remote repository.
    git push              | Push your changes to a remote repository.
    git log               | View commit history.

    Git Workflow:

    1. Clone or Initialize a Repository:

    • Clone: git clone <repository_url>

    • Initialize: git init

    2. Make Changes:

    • Edit files as needed.

    3. Stage Changes:

    • Add modified files to the staging area using git add.

    4. Commit Changes:

    • Save changes to the repository with git commit.

    5. Sync with Remote:

    • Push changes to a remote repository using git push.

    • Pull updates from the remote repository with git pull.
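    The workflow above can be sketched as a shell session (a throwaway repository in a temp directory; the file name, identity, and commit message are illustrative):

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"

git init -q                                # 1. initialize a repository
git config user.email "dev@example.com"    # local identity for the demo
git config user.name  "Dev"

echo "hello" > README.md                   # 2. make changes
git add README.md                          # 3. stage them
git commit -q -m "Initial commit"          # 4. commit
git log --oneline                          # shows the new commit
```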

    GitHub Integration:

    GitHub is a popular hosting service for Git repositories, offering features like pull requests, issue tracking, and CI/CD pipelines.

    1. Host Repositories: Store your projects and collaborate with others.

    2. Pull Requests: Review and discuss code changes before merging.

    3. Issue Tracking: Track bugs, tasks, and feature requests.

    4. CI/CD Pipelines: Automate testing and deployment.

    Best Practices:

    1. Commit Often: Make small, logical commits with descriptive messages.

    2. Use Branches: Separate development tasks to avoid conflicts.

    3. Pull Frequently: Sync changes to avoid merge conflicts.

    4. Resolve Conflicts Carefully: Use tools like git merge or git rebase.

    5. Document Workflow: Define a branching strategy (e.g., Git Flow or trunk-based development).

  • Data Structures and Corresponding STL

    The C++ Standard Template Library (STL) is a collection of powerful and flexible generic classes and functions designed to manage common data structures and algorithms. It provides efficient, reusable implementations of many data structures, which makes it easier to write high-performance code. Below is a detailed overview of common data structures and their corresponding STL components.

    1. Arrays

    Description:

    • Fixed-size, sequential containers for elements of the same type.
    • Used for storing data in contiguous memory.

    STL Component:

    • std::array (C++11)
    • std::vector

    Operations:

    • Random access: O(1)
    • Insertion/Deletion: O(n) for shifting elements.

    Example:

    #include <iostream>
    #include <array>
    
    int main() {
        std::array<int, 5> arr = {1, 2, 3, 4, 5};
        for (const auto& val : arr) {
            std::cout << val << " ";
        }
        return 0;
    }

    2. Linked List

    Description:

    • Sequential data structure with nodes that hold data and a pointer to the next node.

    STL Component:

    • std::list (Doubly Linked List)
    • std::forward_list (Singly Linked List)

    Operations:

    • Insertion/Deletion at any position: O(1)
    • Traversal: O(n)

    Example:

    #include <iostream>
    #include <list>
    
    int main() {
        std::list<int> lst = {10, 20, 30};
        lst.push_back(40);   // Add to the end
        lst.push_front(5);   // Add to the beginning
    
        for (const auto& val : lst) {
            std::cout << val << " ";
        }
        return 0;
    }

    3. Stack

    Description:

    • Last-In-First-Out (LIFO) data structure.

    STL Component:

    • std::stack (adaptor over std::deque)

    Operations:

    • Push: O(1)
    • Pop: O(1)
    • Top (peek): O(1)

    Example:

    #include <iostream>
    #include <stack>
    
    int main() {
        std::stack<int> st;
        st.push(10);
        st.push(20);
        st.push(30);
    
        while (!st.empty()) {
            std::cout << st.top() << " ";
            st.pop();
        }
        return 0;
    }

    4. Queue

    Description:

    • First-In-First-Out (FIFO) data structure.

    STL Component:

    • std::queue (adapter over std::deque by default)
    • std::deque (Double-Ended Queue)

    Operations:

    • Enqueue: O(1)
    • Dequeue: O(1)
    • Front/Back: O(1)

    Example:

    #include <iostream>
    #include <queue>
    
    int main() {
        std::queue<int> q;
        q.push(10);
        q.push(20);
        q.push(30);
    
        while (!q.empty()) {
            std::cout << q.front() << " ";
            q.pop();
        }
        return 0;
    }

    5. Priority Queue

    Description:

    • Specialized queue where elements are dequeued in priority order (not FIFO).
    • Default is a max-heap.

    STL Component:

    • std::priority_queue

    Operations:

    • Push: O(log n)
    • Pop: O(log n)
    • Top: O(1)

    Example:

    #include <iostream>
    #include <queue>
    
    int main() {
        std::priority_queue<int> pq;
        pq.push(10);
        pq.push(20);
        pq.push(15);
    
        while (!pq.empty()) {
            std::cout << pq.top() << " ";
            pq.pop();
        }
        return 0;
    }

    Output:

    20 15 10
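    The default max-heap order can be inverted with a comparator; a sketch of a min-heap using std::greater, where the smallest element is always on top:

    ```cpp
    #include <functional> // For std::greater
    #include <iostream>
    #include <queue>
    #include <vector>

    int main() {
        // std::greater<int> turns the default max-heap into a min-heap
        std::priority_queue<int, std::vector<int>, std::greater<int>> min_pq;
        min_pq.push(10);
        min_pq.push(20);
        min_pq.push(15);

        while (!min_pq.empty()) {
            std::cout << min_pq.top() << " "; // Output: 10 15 20
            min_pq.pop();
        }
        return 0;
    }
    ```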

    6. Set

    Description:

    • Ordered container for unique elements.

    STL Component:

    • std::set
    • std::unordered_set (Hash-based)

    Operations:

    • Insertion/Deletion/Find: O(log n) for std::set, O(1) on average for std::unordered_set.

    Example:

    #include <iostream>
    #include <set>
    
    int main() {
        std::set<int> s;
        s.insert(10);
        s.insert(20);
        s.insert(20); // Ignored as duplicate
    
        for (const auto& val : s) {
            std::cout << val << " ";
        }
        return 0;
    }

    7. Map

    Description:

    • Key-value pairs with unique keys.

    STL Component:

    • std::map (Ordered)
    • std::unordered_map (Hash-based)
    • std::multimap (Allows duplicate keys)

    Operations:

    • Insertion/Deletion/Access: O(log n) for std::map, O(1) on average for std::unordered_map.

    Example:

    #include <iostream>
    #include <map>
    
    int main() {
        std::map<int, std::string> m;
        m[1] = "One";
        m[2] = "Two";
    
        for (const auto& pair : m) {
            std::cout << pair.first << ": " << pair.second << std::endl;
        }
        return 0;
    }

    8. Graph

    Description:

    • A network of nodes (vertices) connected by edges.
    • Can be represented using adjacency lists or matrices.

    STL Component:

    • std::vector<std::vector<int>> (Adjacency List)
    • std::set or std::map for weighted edges.

    Example: Adjacency List

    #include <iostream>
    #include <vector>
    
    int main() {
        std::vector<std::vector<int>> graph(5);
    
        // Add edges
        graph[0].push_back(1);
        graph[1].push_back(2);
        graph[2].push_back(3);
    
        // Print adjacency list
        for (std::size_t i = 0; i < graph.size(); i++) {
            std::cout << i << ": ";
            for (int neighbor : graph[i]) {
                std::cout << neighbor << " ";
            }
            std::cout << std::endl;
        }
        return 0;
    }
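    For weighted edges, each neighbor can be paired with its weight; a sketch of a weighted adjacency list (the edge weights here are made up for illustration):

    ```cpp
    #include <iostream>
    #include <utility> // For std::pair
    #include <vector>

    int main() {
        // graph[u] holds (neighbor, weight) pairs
        std::vector<std::vector<std::pair<int, int>>> graph(3);

        graph[0].push_back({1, 4}); // Edge 0 -> 1 with weight 4
        graph[1].push_back({2, 7}); // Edge 1 -> 2 with weight 7

        for (std::size_t u = 0; u < graph.size(); u++) {
            for (const auto& [v, w] : graph[u]) {
                std::cout << u << " -> " << v << " (weight " << w << ")\n";
            }
        }
        return 0;
    }
    ```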

    9. Hash Tables

    Description:

    • Containers for key-value pairs with hash-based lookups.

    STL Component:

    • std::unordered_map
    • std::unordered_set

    Operations:

    • Insertion/Deletion/Access: O(1) on average, O(n) in the worst case (hash collisions).

    Example:

    #include <iostream>
    #include <unordered_map>
    
    int main() {
        std::unordered_map<std::string, int> umap;
        umap["Apple"] = 10;
        umap["Banana"] = 20;
    
        for (const auto& pair : umap) {
            std::cout << pair.first << ": " << pair.second << std::endl;
        }
        return 0;
    }

    Summary of Common Data Structures and STL

    | Data Structure | STL Component | Insertion | Lookup | Order |
    | --- | --- | --- | --- | --- |
    | Array | std::array, std::vector | O(1) (amortized, at end) | O(1) | Sequential |
    | Linked List | std::list, std::forward_list | O(1) | O(n) | Sequential |
    | Stack | std::stack | O(1) | O(1) | LIFO |
    | Queue | std::queue, std::deque | O(1) | O(1) | FIFO |
    | Priority Queue | std::priority_queue | O(log n) | O(1) (top) | By Priority |
    | Set | std::set, std::unordered_set | O(log n) / O(1) avg | O(log n) / O(1) avg | Sorted / Unordered |
    | Map | std::map, std::unordered_map | O(log n) / O(1) avg | O(log n) / O(1) avg | By Key / Unordered |

  • C++ Standard Template Library (STL)

    The C++ Standard Template Library (STL) is a powerful collection of generic classes and functions that provide commonly used data structures and algorithms. It allows developers to work with collections of data (containers) and perform common tasks such as searching, sorting, and manipulation efficiently.

    Key Components of the STL

    The STL is primarily divided into four main components:

    1. Containers

    Containers are data structures used to store collections of elements. The STL provides several container types, each optimized for specific use cases:

    Sequence Containers:

    • Store data in a linear fashion.
    • Examples:
      • std::vector: Dynamic array.
      • std::deque: Double-ended queue.
      • std::list: Doubly linked list.
      • std::array: Fixed-size array.

    Associative Containers:

    • Store data in sorted order, typically as key-value pairs.
    • Examples:
      • std::set: Collection of unique keys.
      • std::map: Key-value pairs with unique keys.
      • std::multiset: Allows duplicate keys.
      • std::multimap: Allows duplicate key-value pairs.

    Unordered Containers:

    • Use hashing for faster access but do not guarantee order.
    • Examples:
      • std::unordered_set
      • std::unordered_map
      • std::unordered_multiset
      • std::unordered_multimap

    Container Adapters:

    • Provide a modified interface for existing containers.
    • Examples:
      • std::stack: LIFO (Last In First Out).
      • std::queue: FIFO (First In First Out).
      • std::priority_queue: Queue with elements sorted by priority.

    2. Iterators

    Iterators provide a way to traverse containers. They act like pointers and can access elements of a container sequentially.

    Types of Iterators:

    • Input Iterator: Single-pass reading of data (e.g., std::istream_iterator).
    • Output Iterator: Single-pass writing of data (e.g., std::ostream_iterator).
    • Forward Iterator: Forward-only, multi-pass traversal (e.g., std::forward_list).
    • Bidirectional Iterator: Forward and backward traversal (e.g., std::list).
    • Random Access Iterator: Access any element in constant time (e.g., std::vector).
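    A sketch of explicit iterator traversal over a std::vector, whose random access iterators also allow constant-time jumps:

    ```cpp
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> vec = {1, 2, 3, 4};

        // begin() points at the first element, end() one past the last
        for (std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it) {
            std::cout << *it << " ";
        }

        // Random access: jump directly to an element in O(1)
        std::cout << "\nthird element: " << *(vec.begin() + 2) << std::endl; // 3
        return 0;
    }
    ```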

    3. Algorithms

    Algorithms in the STL are generic functions that operate on containers through iterators. They implement common tasks like searching, sorting, modifying, and more.

    Examples:

    • Searching:
      • std::find: Find an element.
      • std::binary_search: Perform binary search (requires sorted range).
    • Sorting:
      • std::sort: Sort elements in ascending order.
      • std::stable_sort: Maintains relative order of equivalent elements.
    • Modifying:
      • std::copy: Copy elements.
      • std::transform: Apply a function to elements.
      • std::replace: Replace specific elements.
    • Others:
      • std::for_each: Apply a function to each element.
      • std::accumulate: Sum or fold the elements of a range (declared in <numeric>).
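    A sketch combining std::transform and std::accumulate; note that std::accumulate is declared in <numeric> rather than <algorithm>:

    ```cpp
    #include <algorithm> // For std::transform
    #include <iostream>
    #include <numeric>   // For std::accumulate
    #include <vector>

    int main() {
        std::vector<int> nums = {1, 2, 3, 4};

        // Square each element in place
        std::transform(nums.begin(), nums.end(), nums.begin(),
                       [](int x) { return x * x; });

        // Sum the squares: 1 + 4 + 9 + 16 = 30
        int sum = std::accumulate(nums.begin(), nums.end(), 0);
        std::cout << "sum of squares: " << sum << std::endl; // 30
        return 0;
    }
    ```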

    4. Functors and Lambdas

    Functors (Function Objects) and lambdas provide callable objects that can be passed to STL algorithms for custom operations.

    Functors:

    • Objects that behave like functions by overloading the operator().
    struct Square {
        int operator()(int x) const { return x * x; }
    };

    Lambdas:

    • Anonymous functions that can capture variables and provide custom logic.
    // vec here stands for any std::vector<int> in scope
    std::for_each(vec.begin(), vec.end(), [](int x) {
        std::cout << x << " ";
    });
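    Putting the two together, a runnable sketch that passes the Square functor above to std::transform and a printing lambda to std::for_each:

    ```cpp
    #include <algorithm>
    #include <iostream>
    #include <vector>

    // Functor: overloads operator() so objects behave like functions
    struct Square {
        int operator()(int x) const { return x * x; }
    };

    int main() {
        std::vector<int> vec = {1, 2, 3};

        // Pass the functor to std::transform
        std::transform(vec.begin(), vec.end(), vec.begin(), Square{});

        // Pass a lambda to std::for_each
        std::for_each(vec.begin(), vec.end(), [](int x) {
            std::cout << x << " "; // Output: 1 4 9
        });
        return 0;
    }
    ```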

    Advantages of STL

    • Reusability: Reduces the need to reinvent common data structures and algorithms.
    • Efficiency: Optimized implementations with consistent time complexities.
    • Portability: Part of the C++ standard, making it portable across platforms.
    • Generality: Provides generic, type-safe solutions via templates.

    Example: Using STL for Sorting and Searching

    #include <iostream>
    #include <vector>
    #include <algorithm> // For sort and binary_search
    
    int main() {
        std::vector<int> numbers = {10, 2, 33, 45, 23};
    
        // Sort the vector
        std::sort(numbers.begin(), numbers.end());
    
        // Display sorted elements
        for (const auto& num : numbers)
            std::cout << num << " ";
        std::cout << std::endl;
    
        // Perform a binary search
        int target = 33;
        if (std::binary_search(numbers.begin(), numbers.end(), target)) {
            std::cout << target << " found." << std::endl;
        } else {
            std::cout << target << " not found." << std::endl;
        }
    
        return 0;
    }

    Best Practices

    1. Choose the Right Container: Use containers suited to your data and access patterns.
    2. Leverage Algorithms: Avoid writing custom loops when STL algorithms can suffice.
    3. Use Range-Based Loops: For cleaner and safer iteration over containers.
    4. Optimize with Move Semantics: Use std::move for efficient data transfers.
    5. Avoid Manual Memory Management: Use containers like std::vector and smart pointers like std::unique_ptr or std::shared_ptr to manage resources automatically.
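    Point 4 can be sketched as follows; after std::move, the moved-from vector is left in a valid but unspecified (typically empty) state:

    ```cpp
    #include <iostream>
    #include <string>
    #include <utility> // For std::move
    #include <vector>

    int main() {
        std::vector<std::string> source = {"alpha", "beta", "gamma"};

        // Transfer the internal buffer instead of copying element by element
        std::vector<std::string> dest = std::move(source);

        std::cout << "dest has " << dest.size() << " elements" << std::endl; // 3
        return 0;
    }
    ```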
  • Big O Notation

    What is Big O notation?

    Big O notation is a mathematical concept used in computer science to describe the time complexity or space complexity of algorithms. It provides a standardized way to compare the efficiency of different algorithms in terms of their worst-case performance.

    Purpose of Big O Notation

    1. Measure Efficiency: Helps analyze how an algorithm scales with input size.
    2. Compare Algorithms: Provides a standard to evaluate and compare different solutions.
    3. Worst-Case Analysis: Describes the upper bound of an algorithm’s performance to ensure reliability under all conditions.

    Common Big O Notations

    1. O(1) – Constant Time
      • The runtime does not depend on the size of the input.
      • Example: Accessing an element in an array by index.
    2. O(log n) – Logarithmic Time
      • The runtime grows logarithmically with the input size.
      • Example: Binary search.
    3. O(n) – Linear Time
      • The runtime grows directly proportional to the input size.
      • Example: Traversing an array.
    4. O(n log n) – Quasilinear Time
      • Often associated with divide-and-conquer algorithms.
      • Example: Merge sort, quicksort (average case).
    5. O(n²) – Quadratic Time / O(n³) – Cubic Time
      • The runtime grows quadratically (or cubically) with the input size.
      • Example: Nested loops, such as in bubble sort.
    6. O(2ⁿ) – Exponential Time
      • The runtime doubles with each additional input element.
      • Example: Recursive algorithms for the Fibonacci sequence.
    7. O(n!) – Factorial Time
      • Extremely inefficient; used in algorithms that generate permutations.
      • Example: Solving the traveling salesman problem using brute force.
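    To make the difference between O(n) and O(log n) concrete, a sketch that counts comparisons for linear and binary search over the same sorted data (exact counts depend on the target chosen):

    ```cpp
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> data;
        for (int i = 0; i < 1024; i++) data.push_back(i); // Sorted: 0..1023

        const int target = 1023; // Worst case for linear search

        // Linear search: up to n comparisons
        int linear_steps = 0;
        for (int x : data) {
            linear_steps++;
            if (x == target) break;
        }

        // Binary search: at most about log2(n) + 1 comparisons
        int binary_steps = 0;
        int lo = 0, hi = static_cast<int>(data.size()) - 1;
        while (lo <= hi) {
            binary_steps++;
            int mid = lo + (hi - lo) / 2;
            if (data[mid] == target) break;
            if (data[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }

        std::cout << "linear: " << linear_steps
                  << " steps, binary: " << binary_steps << " steps" << std::endl;
        return 0;
    }
    ```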

    Why Is Big O Important?

    1. Scalability: Determines how well an algorithm performs with large inputs.
    2. Optimization: Helps identify bottlenecks and improve performance.
    3. Decision Making: Guides the choice of the right algorithm for a given problem.