Data Structures - Part 1

Data Structures - Part 1

Data Structures (Abstract Data Types) - Part 1 (Array)

Data structures are specific methods of organizing and storing collections of data in a way that enables efficient access and manipulation. In a previous article, we discussed python specific data structures (built-in data types: list, tuple, set & dictionary). In this series of articles, we will explore common abstract data types (ADTs) that are language-agnostic. These ADTs can be classified into two categories: linear and non- linear.

Linear data structures; Array, Linked List, Stack, Queue
Non-linear data structures; Tree, Hash Map, Graph

1. Array

Arrays are one of the most fundamental data structures, designed to store collections of data (elements) of the same data type. These elements are stored in contiguous memory locations, allowing access to each element using its index. Arrays are prone to an exception if accessed with an index that exceeds the array’s bounds. (Python - IndexError, Java - ArrayIndexOutOfBoundsException)

Arrays are typically of fixed lengths. But some languages offer dynamic arrays (Python - List, Java - Vector, ArrayList) that can be resized as needed. Arrays have a length property which can be used to identify the capacity of the array. Arrays provide operations to insert, search, delete and read elements.

Following code example demonstrates how to implement an Array data structure using Python’s built-in list and its functions. Functions insert(), append() and extend() are used to insert elements to the array. To perform search operation, we use the in keyword with conditionals and iterators to iterate through the list, and function index() is used to get the index of the element. Similarly, for the delete operation we use conditionals and in keyword to iterate through the list and function remove() is used to delete the element. Additionally we have used iterators to search and remove multiple occurrences of a value.

First let’s create an array and insert some values. We have used the list() function to initiate an array. Alternatively we may use []. Then we can use the append() function to insert elements one by one to the end of the list. The function insert() can be used to insert values to a specific place by passing the index as the first argument. Therefore we can use this function to add elements in the front. If we want to add multiple values to a list we may use the extend() function and pass the values as a list. We can display the data by calling the variable name. We can use indexing to get specific values or slicing to get a range of values.

# CREATE
numbers = list()    # numbers = [] or numbers = [1,2,3]


# INSERT
# Insert a range of values
for i in range(1, 11, 2):     # Odd numbers from 1 to 10
    numbers.append(i)
# Insert a value at specific index
numbers.insert(0, 150) # Insert 150 at the 0th index
# Insert a value at the end
numbers.append(600)
# Insert multiple values at the end
numbers.extend([2, 2, 2, 2, 2])


# RETRIEVE
# Read entire array
print("Initial array:", numbers)
# Read a range of values
print("Sliced array:", numbers[2:-2]) # 3rd to 2nd last elements 
# Read single value
print("1st element: ", numbers[0])
print("Last element:", numbers[-1], end="\n\n")

  • Creating the above empty array has O(1) time and space complexity since it involves allocating a fixed block of memory. It has O(n) space complexity if we initialize it with values, since it needs space to store n elements.

  • Appending an element at the end has O(1) amortized time complexity since no shifting is needed in most cases, although occasional resizing may occur. It has O(1) space complexity as it only requires space for the added element.

  • Inserting an element at the beginning or middle has O(n) time complexity because elements need to be shifted to accommodate the new element. It has O(1) space complexity as no additional space is required except for the new element.

  • Adding multiple elements has O(n) time complexity because all new elements are iterated and appended. It has O(n) space complexity to store the additional elements.

  • Accessing the entire array has O(n) time complexity because all elements are iterated and printed. It has O(1) space complexity since no additional memory is allocated for the operation.

  • Accessing a range of elements with slicing has O(n) time complexity because slicing involves copying these elements to a new list. It has O(n) space complexity as a new list is created to store the sliced elements.

  • Accessing a single element by index has O(1) time complexity because indexing directly retrieves the element. It has O(1) space complexity since no additional memory is allocated.


Now, let’s implement the search operation. We have used the in keyword with the if condition to check if the SEARCH_VALUE is present in the list at least once. We have used in with the for loop to get all the occurrences of the SEARCH_VALUE. Function enumerate() is a built-in function in Python which returns an enumerate object. We have used it to retrieve both the index and the element simultaneously. We may additionally update the found values by assigning new values.

# SEARCH & UPDATE
# Search the 1st occurence of a value
SEARCH_VALUE = 600

if SEARCH_VALUE in numbers:
    print(f"{SEARCH_VALUE} found at index {numbers.index(SEARCH_VALUE)}")
    # Update the found value to 20
    numbers[numbers.index(SEARCH_VALUE)] = 2
else:
    print(f"{SEARCH_VALUE} not found")

print("Updated array: ", numbers, end="\n\n")

# Search all occurences of a value
SEARCH_VALUE = 2

for index, element in enumerate(numbers):
    if element == SEARCH_VALUE:
        print(f"{SEARCH_VALUE} found at index {index}")

  • Search operation has O(n) time complexity as the list is searched sequentially. It has O(1) space complexity since no additional memory is used during the search.

Similar to the search operation, we can use the in keyword with if condition to check if the DELETE_VALUE is present at least once and we can delete the value using remove() function. If we want to remove multiple occurrences we may use in with iterators. We could create a new list and use a for loop to iterate through the original list, appending elements that are not equal to the DELETE_VALUE to the new list. Moreover, for in-place deletion of multiple occurrences, we can use a two pointer approach. The counter variable keeps track of the position in the list where the next non-DELETE_VALUE item should be placed and the index variable is used to access each element. Finally the del keyword is used to remove the leftover elements that were not overwritten.

# DELETE
# Delete the 1st occurence of a value
DELETE_VALUE = 150

if DELETE_VALUE in numbers:
    numbers.remove(DELETE_VALUE)
    print(f"\n{DELETE_VALUE} removed")
else:
    print(f"\n{DELETE_VALUE} not found")

# Delete all occurences of a value
DELETE_VALUE = 2
result = []

for element in numbers:
    if element != DELETE_VALUE:
        result.append(element)

print("New array after deleting: ", result, "\n")

# Delete all occurences of a value - inplace - two pointer
# DELETE_VALUE = 2
# counter = 0
#
# for index in range(len(numbers)):
#     if numbers[index] != DELETE_VALUE:
#         numbers[counter] = numbers[index]
#         counter += 1
#
# del numbers[counter:]   # Trim the list to exclude remaining elements

print("Final array: ", numbers)

  • Deleting an element has O(n) time complexity because the list is searched sequentially to locate the element, and elements after the removed item need to be shifted. It has O(1) space complexity since no extra memory is allocated.

  • Deleting multiple elements with a separate list has O(n) time complexity and O(n) space complexity, because a new list result is created, which may contain all elements of the original list (in the worst case of no elements are deleted).

  • Deleting multiple elements with two pointer approach has O(n) time complexity and O(1) space complexity since the list is traversed exactly once and no additional space is required.


Complete Code Example

# CREATE
numbers = list()    # numbers = []


# INSERT
# Insert a range of values
for i in range(1, 11, 2):     # Odd numbers from 1 to 10
    numbers.append(i)
# Insert a value at specific index
numbers.insert(0, 150) # Insert 150 at the 0th index
# Insert a value at the end
numbers.append(600)
# Insert multiple values at the end
numbers.extend([2, 2, 2, 2, 2])


# RETRIEVE
# Read entire array
print("Initial array:", numbers)
# Read a range of values
print("Sliced array:", numbers[2:-2]) # 3rd to 2nd last elements 
# Read single value
print("1st element: ", numbers[0])
print("Last element:", numbers[-1], end="\n\n")


# SEARCH & UPDATE
# Search the 1st occurence of a value
SEARCH_VALUE = 600

if SEARCH_VALUE in numbers:
    print(f"{SEARCH_VALUE} found at index {numbers.index(SEARCH_VALUE)}")
    # Update the found value to 20
    numbers[numbers.index(SEARCH_VALUE)] = 2
else:
    print(f"{SEARCH_VALUE} not found")

print("Updated array: ", numbers, end="\n\n")

# Search all occurences of a value
SEARCH_VALUE = 2

for index, element in enumerate(numbers):
    if element == SEARCH_VALUE:
        print(f"{SEARCH_VALUE} found at index {index}")


# DELETE
# Delete the 1st occurence of a value
DELETE_VALUE = 150

if DELETE_VALUE in numbers:
    numbers.remove(DELETE_VALUE)
    print(f"\n{DELETE_VALUE} removed")
else:
    print(f"\n{DELETE_VALUE} not found")

# Delete all occurences of a value
DELETE_VALUE = 2
result = []

for element in numbers:
    if element != DELETE_VALUE:
        result.append(element)

print("New array after deleting: ", result, "\n")

# Delete all occurences of a value - inplace - two pointer
# DELETE_VALUE = 2
# counter = 0
#
# for index in range(len(numbers)):
#     if numbers[index] != DELETE_VALUE:
#         numbers[counter] = numbers[index]
#         counter += 1
#
# del numbers[counter:]   # Trim the list to exclude remaining elements

print("Final array: ", numbers)


Comments

Popular posts from this blog

OOP with Python - Part 2

OOP with Python - Part 1