Content is user-generated and unverified.

    Arrays in R

    What is an Array?

    An array is a data structure that stores values of a single type along one or more dimensions. A matrix is the two-dimensional case; the term "array" is most often used when there are three, four, or more dimensions, extending a matrix's rows and columns to further axes. All elements in an array must be of the same data type.

    Key characteristics:

    • Multi-dimensional (can have 3, 4, 5, or more dimensions)
    • All elements must be the same data type (homogeneous)
    • Elements are arranged in a rectangular structure across multiple dimensions
    • Each element can be accessed by specifying its position in each dimension
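
    To make the "same data type" rule concrete, here is a small sketch (the variable names are illustrative and not part of the original notes) of how R coerces mixed input to one common type when an array is built:

    ```r
    # Mixed input is coerced to the most general type before the array is built
    mixed <- array(c(1, "a", TRUE), dim = c(1, 3, 1))
    typeof(mixed)       # "character"
    as.vector(mixed)    # "1" "a" "TRUE"

    # Logical and numeric values together become numeric
    nums <- array(c(TRUE, FALSE, 2.5, 4), dim = c(2, 2, 1))
    typeof(nums)        # "double"
    ```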

    Dimensional analogy:

    • Vector: 1-dimensional (a line of data)
    • Matrix: 2-dimensional (a table with rows and columns)
    • Array: 3+ dimensional (multiple tables stacked together, or higher dimensions)
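
    The analogy can be checked directly by looking at the dim attribute of each kind of object. This is a minimal sketch with illustrative variable names; it also shows that R treats a matrix as the two-dimensional case of an array:

    ```r
    v <- 1:6                              # plain vector: no dim attribute
    m <- matrix(1:6, nrow = 2)            # 2 dimensions
    a <- array(1:24, dim = c(2, 3, 4))    # 3 dimensions

    is.null(dim(v))    # TRUE
    length(dim(m))     # 2
    length(dim(a))     # 3

    is.array(m)        # TRUE: a matrix is a two-dimensional array
    is.matrix(a)       # FALSE: a 3D array is not a matrix
    ```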

    Creating Arrays

    Using the array() function:

    r
    # Create a 3-dimensional array: 2×3×4 (2 rows, 3 columns, 4 layers)
    data_vector <- 1:24
    my_array <- array(data_vector, dim = c(2, 3, 4))
    my_array
    , , 1
    
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6
    
    , , 2
    
         [,1] [,2] [,3]
    [1,]    7    9   11
    [2,]    8   10   12
    
    , , 3
    
         [,1] [,2] [,3]
    [1,]   13   15   17
    [2,]   14   16   18
    
    , , 4
    
         [,1] [,2] [,3]
    [1,]   19   21   23
    [2,]   20   22   24
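
    The printed layers above follow from the order in which array() fills its cells. The sketch below (variable names are illustrative) shows that the first index changes fastest, and that data shorter than the array is silently recycled:

    ```r
    # Values fill the first dimension fastest, then the second, then the third
    filled <- array(1:24, dim = c(2, 3, 4))
    filled[2, 1, 1]    # 2  (next row)
    filled[1, 2, 1]    # 3  (next column)
    filled[1, 1, 2]    # 7  (next layer)

    # Data shorter than prod(dim) is recycled to fill the array
    recycled <- array(1:3, dim = c(2, 2, 2))
    as.vector(recycled)    # 1 2 3 1 2 3 1 2
    ```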

    Creating arrays with different dimensions:

    r
    # 4-dimensional array: 2×2×2×3
    four_d_array <- array(1:24, dim = c(2, 2, 2, 3))
    four_d_array
    , , 1, 1
    
         [,1] [,2]
    [1,]    1    3
    [2,]    2    4
    
    , , 2, 1
    
         [,1] [,2]
    [1,]    5    7
    [2,]    6    8
    
    , , 1, 2
    
         [,1] [,2]
    [1,]    9   11
    [2,]   10   12
    
    # ... and so on
    
    # Simple 3D array with specific values
    simple_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))
    simple_array
    , , 1
    
         [,1] [,2]
    [1,]    1    3
    [2,]    2    4
    
    , , 2
    
         [,1] [,2]
    [1,]    5    7
    [2,]    6    8

    Creating arrays from matrices:

    r
    # Create multiple matrices and combine them into an array
    matrix1 <- matrix(1:6, nrow = 2, ncol = 3)
    matrix2 <- matrix(7:12, nrow = 2, ncol = 3)
    matrix3 <- matrix(13:18, nrow = 2, ncol = 3)
    
    # Combine matrices into a 3D array
    combined_array <- array(c(matrix1, matrix2, matrix3), dim = c(2, 3, 3))
    combined_array
    , , 1
    
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6
    
    , , 2
    
         [,1] [,2] [,3]
    [1,]    7    9   11
    [2,]    8   10   12
    
    , , 3
    
         [,1] [,2] [,3]
    [1,]   13   15   17
    [2,]   14   16   18
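
    When the number of matrices varies, the same idea can be wrapped in a function. The helper below, stack_matrices(), is hypothetical (it is not a base R function); it is one possible sketch that checks the shapes match before stacking:

    ```r
    # Hypothetical helper: stack same-sized matrices along a new third dimension
    stack_matrices <- function(...) {
      mats <- list(...)
      first_dim <- dim(mats[[1]])
      same_shape <- vapply(mats, function(m) identical(dim(m), first_dim), logical(1))
      stopifnot(all(same_shape))
      array(unlist(mats), dim = c(first_dim, length(mats)))
    }

    matrix1 <- matrix(1:6, nrow = 2, ncol = 3)
    matrix2 <- matrix(7:12, nrow = 2, ncol = 3)
    matrix3 <- matrix(13:18, nrow = 2, ncol = 3)

    stacked <- stack_matrices(matrix1, matrix2, matrix3)
    dim(stacked)                          # 2 3 3
    identical(stacked[, , 2], matrix2)    # TRUE
    ```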

    Understanding Array Structure

    Checking array properties:

    r
    my_array <- array(1:24, dim = c(2, 3, 4))
    
    # Check if it's an array
    is.array(my_array)
    [1] TRUE
    
    # Get dimensions
    dim(my_array)
    [1] 2 3 4
    
    # Get number of dimensions
    length(dim(my_array))
    [1] 3
    
    # Check the object class, then the storage type of its elements
    class(my_array)
    [1] "array"
    
    typeof(my_array)
    [1] "integer"
    
    # Get structure overview
    str(my_array)
     int [1:2, 1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
    
    # Total number of elements
    length(my_array)
    [1] 24

    Understanding dimensions:

    r
    # For a 3D array with dim = c(2, 3, 4):
    # - 1st dimension: 2 rows
    # - 2nd dimension: 3 columns  
    # - 3rd dimension: 4 layers/slices
    
    # Think of it as 4 matrices, each with 2 rows and 3 columns
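
    Underneath, the array is still one long vector, and the dimensions only decide how positions map onto it. As a rough sketch (the names i, j, k and pos are illustrative), the position of an element in a 3D array can be computed by hand and checked against normal indexing:

    ```r
    a <- array(1:24, dim = c(2, 3, 4))
    d <- dim(a)

    # Position of element [i, j, k] in the underlying vector
    i <- 1; j <- 2; k <- 3
    pos <- i + (j - 1) * d[1] + (k - 1) * d[1] * d[2]
    pos                     # 15
    a[pos] == a[i, j, k]    # TRUE

    # arrayInd() goes the other way: linear position -> (row, column, layer)
    arrayInd(15, dim(a))    # a one-row matrix holding 1 2 3
    ```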

    Accessing Array Elements (Indexing)

    Individual elements:

    r
    my_array <- array(1:24, dim = c(2, 3, 4))
    
    # Access element at row 1, column 2, layer 3
    my_array[1, 2, 3]
    [1] 15
    
    # Access element at row 2, column 1, layer 1
    my_array[2, 1, 1]
    [1] 2

    Slicing arrays (getting subsets):

    r
    # Get entire layer (matrix) 2
    my_array[, , 2]
         [,1] [,2] [,3]
    [1,]    7    9   11
    [2,]    8   10   12
    
    # Get row 1 from all layers
    my_array[1, , ]
         [,1] [,2] [,3] [,4]
    [1,]    1    7   13   19
    [2,]    3    9   15   21
    [3,]    5   11   17   23
    
    # Get column 2 from all layers
    my_array[, 2, ]
         [,1] [,2] [,3] [,4]
    [1,]    3    9   15   21
    [2,]    4   10   16   22
    
    # Get specific rows and columns from layer 3
    my_array[1:2, 1:2, 3]
         [,1] [,2]
    [1,]   13   15
    [2,]   14   16
    
    # Get multiple layers
    my_array[, , c(1, 3)]
    , , 1
    
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6
    
    , , 3
    
         [,1] [,2] [,3]
    [1,]   13   15   17
    [2,]   14   16   18
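
    A few further indexing forms are worth knowing alongside the slices above. The sketch below uses only base R indexing; the variable name idx is illustrative:

    ```r
    my_array <- array(1:24, dim = c(2, 3, 4))

    # A single layer normally drops to a matrix; drop = FALSE keeps all three dimensions
    dim(my_array[, , 2])                  # 2 3
    dim(my_array[, , 2, drop = FALSE])    # 2 3 1

    # Negative indices exclude positions: everything except column 1
    dim(my_array[, -1, ])                 # 2 2 4

    # A matrix with one row per element and one column per dimension
    # picks out scattered elements in a single call
    idx <- rbind(c(1, 2, 3),
                 c(2, 1, 1))
    my_array[idx]                         # 15 2
    ```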

    Working with 4D arrays:

    r
    four_d <- array(1:48, dim = c(2, 3, 2, 4))
    
    # Access single element: row 1, col 2, layer 1, "book" 3
    four_d[1, 2, 1, 3]
    [1] 27
    
    # Get entire 3D "slice" from the 4th dimension
    four_d[, , , 2]
    , , 1
    
         [,1] [,2] [,3]
    [1,]   13   15   17
    [2,]   14   16   18
    
    , , 2
    
         [,1] [,2] [,3]
    [1,]   19   21   23
    [2,]   20   22   24

    Adding Names to Array Dimensions

    Naming dimensions:

    r
    my_array <- array(1:24, dim = c(2, 3, 4))
    
    # Add dimension names
    dimnames(my_array) <- list(
      c("Row1", "Row2"),                    # 1st dimension (rows)
      c("Col1", "Col2", "Col3"),           # 2nd dimension (columns)
      c("Layer1", "Layer2", "Layer3", "Layer4")  # 3rd dimension (layers)
    )
    
    my_array
    , , Layer1
    
         Col1 Col2 Col3
    Row1    1    3    5
    Row2    2    4    6
    
    , , Layer2
    
         Col1 Col2 Col3
    Row1    7    9   11
    Row2    8   10   12
    
    # ... and so on
    
    # Access by name
    my_array["Row1", "Col2", "Layer3"]
    [1] 15
    
    my_array[, , "Layer1"]
         Col1 Col2 Col3
    Row1    1    3    5
    Row2    2    4    6

    Setting names during creation:

    r
    named_array <- array(1:12, 
                        dim = c(2, 3, 2),
                        dimnames = list(
                          Subjects = c("Person1", "Person2"),
                          Tests = c("Math", "Science", "English"),
                          Time = c("Before", "After")
                        ))
    named_array
    , , Time = Before
    
            Tests
    Subjects Math Science English
      Person1    1       3       5
      Person2    2       4       6
    
    , , Time = After
    
            Tests
    Subjects Math Science English
      Person1    7       9      11
      Person2    8      10      12
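
    Named dimensions can also be used when inspecting or summarising the array. The following sketch rebuilds named_array so it runs on its own, then refers to a dimension by name rather than by position:

    ```r
    named_array <- array(1:12,
                         dim = c(2, 3, 2),
                         dimnames = list(
                           Subjects = c("Person1", "Person2"),
                           Tests = c("Math", "Science", "English"),
                           Time = c("Before", "After")
                         ))

    names(dimnames(named_array))     # "Subjects" "Tests" "Time"
    dimnames(named_array)$Tests      # "Math" "Science" "English"

    # MARGIN can be given by dimension name instead of position
    test_totals <- apply(named_array, "Tests", sum)
    test_totals                      # Math 18, Science 26, English 34

    # Change from Before to After for every subject and test
    change <- named_array[, , "After"] - named_array[, , "Before"]
    change                           # every entry is 6
    ```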

    Modifying Arrays

    Changing individual elements:

    r
    my_array <- array(1:24, dim = c(2, 3, 4))
    
    # Change single element
    my_array[1, 2, 3] <- 999
    my_array[1, 2, 3]
    [1] 999
    
    # Change entire layer
    my_array[, , 1] <- matrix(c(100, 200, 300, 400, 500, 600), nrow = 2, ncol = 3)
    my_array[, , 1]
         [,1] [,2] [,3]
    [1,]  100  300  500
    [2,]  200  400  600
    
    # Change all elements in specific positions across all layers
    my_array[1, 1, ] <- c(10, 20, 30, 40)
    my_array[1, 1, ]
    [1] 10 20 30 40
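
    Two side effects of assignment are easy to miss: the storage type of the whole array can change, and a logical condition can replace many elements at once. A minimal sketch:

    ```r
    my_array <- array(1:24, dim = c(2, 3, 4))
    typeof(my_array)            # "integer"

    # Assigning a double converts the whole array to double
    my_array[1, 2, 3] <- 999
    typeof(my_array)            # "double"

    # A logical condition selects every matching element, whatever its position
    my_array[my_array > 20] <- 0
    sum(my_array == 0)          # 5 (21, 22, 23, 24 and the 999)
    dim(my_array)               # 2 3 4, the shape is unchanged
    ```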

    Array Arithmetic

    Element-wise operations:

    r
    array1 <- array(1:8, dim = c(2, 2, 2))
    array2 <- array(9:16, dim = c(2, 2, 2))
    
    # Addition
    result_add <- array1 + array2
    result_add
    , , 1
    
         [,1] [,2]
    [1,]   10   14
    [2,]   12   16
    
    , , 2
    
         [,1] [,2]
    [1,]   18   22
    [2,]   20   24
    
    # Multiplication (element-wise)
    result_mult <- array1 * array2
    result_mult
    , , 1
    
         [,1] [,2]
    [1,]    9   33
    [2,]   20   48
    
    , , 2
    
         [,1] [,2]
    [1,]   65  105
    [2,]   84  128
    
    # Operations with single values
    array1 + 10
    , , 1
    
         [,1] [,2]
    [1,]   11   13
    [2,]   12   14
    
    , , 2
    
         [,1] [,2]
    [1,]   15   17
    [2,]   16   18
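
    Element-wise arithmetic between two arrays assumes matching dimensions. The sketch below (variable names are illustrative) contrasts a short vector, which is recycled, with a differently shaped array, which R refuses to combine:

    ```r
    array1 <- array(1:8, dim = c(2, 2, 2))

    # A shorter vector is recycled in storage order (down the rows first)
    scaled <- array1 * c(1, 10)
    scaled[, , 1]
    #      [,1] [,2]
    # [1,]    1    3
    # [2,]   20   40

    # Arrays with different dimensions are rejected rather than recycled
    other <- array(1:4, dim = c(2, 2))
    outcome <- tryCatch(array1 + other, error = function(e) "error")
    outcome    # "error"
    ```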

    Useful Array Functions

    Apply functions across dimensions:

    r
    my_array <- array(1:24, dim = c(2, 3, 4))
    
    # MARGIN names the dimension(s) to keep; the function is applied over all the others
    # MARGIN = 1: one result per row
    apply(my_array, 1, sum)  # Sum each row across all columns and layers
    [1] 144 156
    
    # MARGIN = 2: one result per column
    apply(my_array, 2, sum)  # Sum each column across all rows and layers
    [1]  84 100 116
    
    # MARGIN = 3: one result per layer
    apply(my_array, 3, sum)  # Sum each layer across all rows and columns
    [1]  21  57  93 129
    
    # MARGIN = c(1,2): preserve rows and columns, sum across layers
    apply(my_array, c(1, 2), sum)
         [,1] [,2] [,3]
    [1,]   40   48   56
    [2,]   44   52   60
    
    # Apply custom functions
    apply(my_array, 3, mean)  # Mean of each layer
    [1]  3.5  9.5 15.5 21.5
    
    apply(my_array, c(1, 2), max)  # Maximum across layers for each row/column combo
         [,1] [,2] [,3]
    [1,]   19   21   23
    [2,]   20   22   24
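
    For plain sums, base R's rowSums() and colSums() accept a dims argument and give the same totals as the apply() calls above. The sketch below is a cross-check, not a replacement for apply():

    ```r
    my_array <- array(1:24, dim = c(2, 3, 4))

    rowSums(my_array)              # 144 156, same totals as apply(my_array, 1, sum)
    colSums(my_array, dims = 2)    # 21 57 93 129, same as apply(my_array, 3, sum)
    rowSums(my_array, dims = 2)    # 2 x 3 matrix, same as apply(my_array, c(1, 2), sum)

    # Every way of collapsing the array must account for all 24 values
    sum(apply(my_array, 2, sum)) == sum(my_array)    # TRUE (both are 300)
    ```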

    Other useful functions:

    r
    # Get margins (dimension sizes)
    margins <- dim(my_array)
    margins
    [1] 2 3 4
    
    # Convert to vector (flattened)
    as.vector(my_array)
    [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
    
    # Get dimension names
    dimnames(my_array)
    NULL  # (if no names were set)

    Practical Examples

    Example 1: Student test scores over time

    r
    # Scores for 3 students, 4 subjects, across 2 semesters
    scores <- array(
      data = c(85, 90, 78,    # Math, Fall (Alice, Bob, Carol)
               88, 92, 80,    # Science, Fall
               82, 87, 75,    # English, Fall
               79, 85, 72,    # History, Fall
               87, 93, 80,    # Math, Spring
               90, 95, 83,    # Science, Spring
               84, 89, 77,    # English, Spring
               81, 87, 74),   # History, Spring
      dim = c(3, 4, 2),
      dimnames = list(
        Students = c("Alice", "Bob", "Carol"),
        Subjects = c("Math", "Science", "English", "History"),
        Semester = c("Fall", "Spring")
      )
    )
    
    # View all data
    scores
    
    # Get Alice's scores across all subjects and semesters
    scores["Alice", , ]
    
    # Get Math scores for all students in Spring semester
    scores[, "Math", "Spring"]
    
    # Calculate average score per student across all subjects and semesters
    apply(scores, 1, mean)
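
    The same array supports a few more summaries. The sketch below rebuilds scores so it runs on its own; the variable name improvement is illustrative:

    ```r
    scores <- array(
      c(85, 90, 78, 88, 92, 80, 82, 87, 75, 79, 85, 72,    # Fall
        87, 93, 80, 90, 95, 83, 84, 89, 77, 81, 87, 74),   # Spring
      dim = c(3, 4, 2),
      dimnames = list(
        Students = c("Alice", "Bob", "Carol"),
        Subjects = c("Math", "Science", "English", "History"),
        Semester = c("Fall", "Spring")
      )
    )

    # Average per student over all subjects and semesters
    apply(scores, 1, mean)        # Alice 84.5, Bob 89.75, Carol 77.375

    # Average per subject within each semester (a 4 x 2 matrix)
    apply(scores, c(2, 3), mean)

    # Point change from Fall to Spring for every student and subject
    improvement <- scores[, , "Spring"] - scores[, , "Fall"]
    improvement["Bob", "Math"]    # 3
    ```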

    Example 2: Image data (RGB values)

    r
    # Create a simple 4×4 pixel image with RGB channels
    # Dimensions: height × width × color_channels
    image_data <- array(
      data = sample(0:255, 48, replace = TRUE),  # Random pixel values
      dim = c(4, 4, 3),  # 4×4 pixels, 3 color channels (R, G, B)
      dimnames = list(
        Height = paste0("Row", 1:4),
        Width = paste0("Col", 1:4),
        Channel = c("Red", "Green", "Blue")
      )
    )
    
    # View red channel
    image_data[, , "Red"]
    
    # Get RGB values for pixel at row 2, column 3
    image_data[2, 3, ]
    
    # Calculate average intensity for each color channel
    apply(image_data, 3, mean)
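
    Two simple per-pixel operations follow naturally from this layout. In the sketch below the seed is an assumption added only so the random pixels are repeatable, and the grayscale rule (a plain mean of the three channels) is one simple choice among several:

    ```r
    set.seed(1)    # assumed seed, only for repeatable random pixels
    image_data <- array(
      sample(0:255, 48, replace = TRUE),
      dim = c(4, 4, 3),
      dimnames = list(
        Height = paste0("Row", 1:4),
        Width = paste0("Col", 1:4),
        Channel = c("Red", "Green", "Blue")
      )
    )

    # Grayscale as the plain mean of the three channels at each pixel
    gray <- apply(image_data, c(1, 2), mean)
    dim(gray)          # 4 4

    # Invert the image: every channel value v becomes 255 - v
    inverted <- 255 - image_data
    range(inverted)    # stays within 0 to 255
    ```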

    Converting Between Data Types

    Array to matrix (a 2D array is already a matrix, so as.matrix() returns it unchanged):

    r
    two_d_array <- array(1:12, dim = c(3, 4))
    as.matrix(two_d_array)
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12

    Array to vector:

    r
    my_array <- array(1:8, dim = c(2, 2, 2))
    as.vector(my_array)
    [1] 1 2 3 4 5 6 7 8

    Reshaping arrays:

    r
    # Change dimensions while keeping the same data
    original <- array(1:24, dim = c(2, 3, 4))
    reshaped <- array(original, dim = c(4, 6, 1))  # Reshape to 4×6×1
    dim(reshaped)
    [1] 4 6 1
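
    Another common way to reshape is to assign to dim() directly. The sketch below shows that the values keep their storage order, and that a shape with the wrong number of cells is rejected (unlike array(), which recycles or truncates its data to fit):

    ```r
    original <- array(1:24, dim = c(2, 3, 4))

    # Assigning to dim() reshapes in place; the new sizes must still multiply to 24
    reshaped <- original
    dim(reshaped) <- c(6, 4)
    reshaped[, 1]                                          # 1 2 3 4 5 6
    identical(as.vector(reshaped), as.vector(original))    # TRUE

    # A mismatched shape is rejected
    bad <- tryCatch({ dim(reshaped) <- c(5, 5); "ok" }, error = function(e) "error")
    bad                                                    # "error"
    ```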

    Key Points to Remember

    1. Arrays extend matrices - matrices are 2D, arrays can be 3D, 4D, or higher
    2. All elements same type - homogeneous data structure
    3. Dimensional indexing - use [dim1, dim2, dim3, ...] notation
    4. Leave dimensions empty - use [, , 2] to get entire layer 2
    5. apply() is powerful - MARGIN names the dimension(s) to keep; the function is applied over all the others
    6. Dimension order matters - typically rows, columns, then higher dimensions
    7. Memory considerations - the element count is the product of the dimension sizes, so arrays can get very large very quickly
    8. Names improve readability - use dimnames() for better code clarity
    9. Think in slices - 3D arrays are like stacks of matrices
    10. Real-world applications - common in image processing, time series analysis, and scientific computing
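
    Points 6 and 7 can be made concrete with a short sketch. It uses base R's aperm() to reorder dimensions, and a hypothetical helper, estimate_mib(), for a rough size estimate that assumes 8 bytes per double and ignores object overhead:

    ```r
    a <- array(1:24, dim = c(2, 3, 4))

    # Point 6: move layers to the front, so the order becomes (layer, row, column)
    b <- aperm(a, c(3, 1, 2))
    dim(b)                      # 4 2 3
    b[3, 1, 2] == a[1, 2, 3]    # TRUE: same element, indices reordered

    # Point 7: hypothetical helper giving a rough size in MiB
    estimate_mib <- function(dims, bytes_per_element = 8) {
      prod(dims) * bytes_per_element / 1024^2
    }
    estimate_mib(c(100, 100, 100))      # about 7.6
    estimate_mib(c(1000, 1000, 100))    # about 763

    # object.size() reports the actual footprint of an existing array
    small <- array(0, dim = c(100, 100, 100))
    format(object.size(small), units = "Mb")
    ```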

    Arrays are powerful for handling complex multi-dimensional data but can be memory-intensive. They're particularly useful in scientific computing, image processing, and when you need to organize data across multiple categorical dimensions simultaneously.
