Data Structures
- A way of effectively organizing data
- A collection of data type ‘values’ that allow for:
- efficient access and modifications
- efficient access and modifications
- A collection of data type ‘values’ that allow for:
- Data structure basic forms:
- Linear: array, list, stack, queue
- Tree: binary tree, heap, space partition etc.
- Hash: distributed hash table, hash tree, dictionary etc.
- Graphs: decision, directed, acyclic etc.
Signed – positive and negative values.
Unsigned – only positive values (double-positive values range).
Data Types
- Primitive
- Basic data types such as a boolean, integer, float, or a char
- Composite
- Any data type composed of primitive or composite types
- E.g. struct, array, string etc.
- Any data type composed of primitive or composite types
- Abstract
- A data type that is defined by its behaviour
- E.g. tuple, set, stack, queue, graph etc.
- A data type that is defined by its behaviour
Arrays
- A collection of elements
- Where each item is identified by an index or a key
- Can be a multidimensional or single node
- Jagged arrays (e.g., in C# – mixes length arrays)
- Basic arrays are not resizable, i.e. they’re immutable
- Values are referenced by the index
- Calculating item index = Θ(1)
- Inserting or deleting items the beginning/ middle = Θ(n)
- Inserting or deleting item at the end = Θ(1)
Linked List
- A collection of items, nodes
- Each consisting of the day and a pointer that the next node
- The data is what you’ve assigned to that element/node, whereas the pointer is a memory address that refers to the next node in the list
- Each consisting of the day and a pointer that the next node
- Similar to arrays but with fewer limitations
- A linked list is different from an array in that a contiguous memory allocation does not determine the order of the elements within the list.
- Elements of the linked list can be sporadically placed in the memory, making it much more efficient.
- The singly-linked list has a point to the next, while the double linked list has a pointer to the previous and next element
- Item lookup is linear in time complexity Θ(n)
When to use Linked Lists
- When you need constant-time insertions/deletions from the list
- Such as in real-time computing where time predictability is absolutely critical
- If you don’t know how many items will be in the list
- With arrays, you may need to re-declare and copy memory if the array grows too big
- When you don’t need random access to any elements
- If you want to be able to insert items in the middle of the list
- such as a priority queue
Custom LinkedList Implementation in C# and Python
Note: manly languages including C# and Java come with a built-in LinkedList<T>
type. In C# the LinkedList<T> is a general-purpose linked list available from System.Collections.Generic
namespace.
using System;
namespace SinglyLinkedList
{
public class Node
{
public int data;
public Node next;
public void displayNode()
{
Console.WriteLine("<" + data + ">");
}
}
public class SinglyLinkedList
{
private Node first;
public bool isEmpty()
{
return (first == null);
}
public void insertFirst(int data)
{
Node newNode = new Node();
newNode.data = data;
newNode.next = first;
first = newNode;
}
public Node deleteFirst()
{
Node temp = first;
first = first.next;
return temp;
}
public void displayList()
{
Console.WriteLine("First -> Last) ");
Node current = first;
while (current != null)
{
current.displayNode();
current = current.next;
}
Console.WriteLine();
}
public void insertLast(int data)
{
Node current = first;
while (current.next != null)
{
current = current.next;
}
Node newNode = new Node();
newNode.data = data;
current.next = newNode;
}
}
}
class Node:
def __init__(self, data = None, next=None):
self.data = data
self.next = next
class SinglyLinkedList:
def __init__(self):
self.head = None
def insert(self, data):
newNode = Node(data)
if(self.head):
current = self.head
while(current.next):
current = current.next
current.next = newNode
else:
self.head = newNode
def print_nodes(self):
current = self.head
while(current):
print(current.data)
current = current.next
Stacks
- A linear data structure that follows a particular order in which the operations are performed
- Last In-First Out (LIFO)
- Data can only be added to or removed from the top of the stack
- The stack has two principal operations: push (add) and pop (delete)
- peek – used to return the item at the top of the stack
- isEmpty – used to check whether the stack contains items
- Time complexity for item access and search is Θ(n) and Θ(1) for inserts and deletes
- Space complexity is Θ(n)
When to use stacks
- Reverse Polish notation for evaluation arithmetic expressions
- Syntax and Expression processing
- Undo, Redo operations
- Used in recursion and low-level assembly programming
- Cold stack (space for local params), Backtracking: e.g. browser back stack
C# Implementation
using static System.Console;
namespace Stack
{
public class Stack
{
private int maxSize;
private string[] stackArray;
private int top;
public Stack(int size)
{
maxSize = size;
stackArray = new string[maxSize];
top = -1;
}
public void push(string m)
{
if (isFull())
{
WriteLine("Stack is full");
}
else
{
top++;
stackArray[top] = m;
}
}
public string pop()
{
if (isEmpty())
{
WriteLine("Stack is empty.");
return "--";
}
else
{
int old_top = top;
top--;
return stackArray[old_top];
}
}
public string peek()
{
return stackArray[top];
}
public bool isEmpty()
{
return (top == -1);
}
private bool isFull()
{
return (maxSize - 1 == top);
}
}
}
Python Implementation
class Stack:
def __init__(self):
self.items = []
def is_empty(self):
return not self.items
def push(self, item):
return self.items.append(item)
def pop(self):
return self.items.pop()
def peek(self):
return self.items[-1]
def size(self):
return len(self.items)
def __str__(self):
return str(self.items)
if __name__ == "__main__":
s = Stack()
s.push(992221555)
print(s.size())
s.pop()
print(s.peek())
print(s)
Queues
- A linear structure that follows a particular order in which the operations are performed
- First In First Out (FIFO)
- Items are removed from the list in order in which they were added
- When we insert an item onto the queue this is known as the ENQUEUE function, whilst deleting an item is referred to as the DEQUEUE function
When to use queues
- Order processing
- Messaging
Custom Implementation (C#)
using static System.Console;
namespace QueueDemo
{
public class Queue
{
private int maxSize;
private long[] myQueue;
private int front;
private int rear;
private int items;
public Queue(int size)
{
maxSize = size;
myQueue = new long[size];
front = 0;
rear = -1;
items = 0;
}
public void insert(long j)
{
if (isFull())
{
WriteLine("Queue is full");
}
else
{
if (rear == maxSize - 1)
{
rear = -1;
}
rear++;
myQueue[rear] = j;
items++;
}
}
public long remove()
{
long temp = myQueue[front];
front++;
if (front == maxSize)
{
front = 0;
}
return temp;
}
public long peekFront()
{
return myQueue[front];
}
public bool isEmpty()
{
return (items == 0);
}
private bool isFull()
{
return (items == maxSize);
}
public void view()
{
Write("[");
for (int i = 0; i < myQueue.Length; i++)
{
Write(myQueue[i] + " ");
}
WriteLine("]");
}
}
}
Big O Notation
- The notation used to describe the performance or (time) complexity of an algorithm
- O(1)
- Constant
- Consistent duration of algorithm execution regardless of the size of the input
- E.g. determining if a binary number is even or odd, calculating , or using a constant-size lookup table
- On(log n)
- Logarithmic
- 10x the data = 2x more time
- E.g. finding an item in a sorted array with a binary search or a balanced search tree or all operations in a Binomial heap
- O(n):
- Linear
- 10x the data = 10x more time
- E.g. finding an item in an unsorted list or in an unsorted array adding two n-bit integers by a ripple carry
- O(n^2)
Big O Cheatsheet – https://www.bigocheatsheet.com/
Algorithms
- A self-contained sequence of actions
- Perform calculations, data processing, and automated tasks
- An efficient code
When to Use Algorithms
- Compressing data
- Sorting data
- Generating random numbers
Recursion
- When a function calls itself
- Recursive functions need to have a breaking condition
- To prevent infinite loops and eventual crashes
- Each time the function is called, the old arguments are saved
- This is called “call stack”
function countdown(n){
if(n==0){
alert('done!');
} else {
alert(n + "...");
countdown(n-1);
}
}
countdown(10);
Linear Search
- In computer science, a linear search or sequential search is a method for finding an element within a list
- It sequentially checks each element of the list until a match is found or the whole list has been searched
- Very inefficient search
Custom Implementation (C#)
using System;
using static System.Console;
namespace LinearSearch
{
class Program
{
static void Main(string[] args)
{
int theValue = 7;
int[] array = new int[] { 1, 4, 5, 7, 9, 22 };
WriteLine("Our array contains:");
Array.ForEach(array, x => Write(x + " "));
Write($"\n\nThe result of a linear search " +
$"for {theValue} is: ");
WriteLine(linearSearch(array,theValue));
}
static int linearSearch(int[] a, int x) {
for (int i = 0; i < a.Length; i++)
{
if (a[i]==x)
{
return i;
}
}
return -1;
}
}
}
Binary Search
- In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array
- Binary search compares the target value to the middle element of the array
Custom Implementation (C#)
using System;
using static System.Console;
namespace BinarySearch
{
class Program
{
static void Main(string[] args)
{
int theValue = 43;
int[] array = new int[] { 11, 22, 43, 54, 57, 59, 62, 78 };
WriteLine("Our array contains: ");
Array.ForEach(array, x => Write(x + " "));
Write($"\n\nThe result of a binary search for {theValue} is: ");
WriteLine(binarySearch(array, theValue));
}
/// <summary>
/// a = array
/// x = what we're searching
/// p = first index
/// q = midpoint
/// r = last index
/// </summary>
public static int binarySearch(int[] a, int x) {
// Step 1 - intialize the variables
int p = 0; // beginning of the range
int r = a.Length - 1; // the end of the range aka last slot
// Step 2 - we do something (search for our value)
while (p <= r) // when true, we are still in the range
{
int q = (p + r) / 2; // find the midpoint
if (x < a[q])
r = q - 1; // set r to mid point. We narrowed to 1st
// half of the array in the case x is less
// the data in slot q
else if (x > a[q])
p = q + 1; // the opposite occurs. we bring p to the
// right of the array
else return q; // here we found our value b/c x must = q
}
// Step 3 - indicate not found
return -1;
}
}
}