Friday, May 15, 2015

Matrix using Python


My second post regarding data structure using Python is about 2 dimension array and matrix.

For 2D array, naturally we can think of using nested lists to implement it, here is the class: 



class Array2d:
    def __init__(self, numRows, numCols):
        self._theRows = [None] * numRows
        for i in range (numRows):
            self._theRows[i] = [None] * numCols
    #return the number of rows
    def numRows(self):
        return len(self._theRows)

    #return number of columns
    def numCols(self):
        return len(self._theRows[0])

    def clear(self):
        for row in range(self.numRows()):
            row.clear()

    #initialize all 2d array to a value
    def initialize(self, value):
        for row in range(self.numRows()):
            for col in range(self.numCols()):
                self[row, col] = value

 
    def __getitem__(self, key):
        assert len(key) == 2, "Invalid number of coordinates"
        row = key[0]
        col = key[1]
        assert 0 <= row < self.numRows() \
            and 0 <= col < self.numCols(), \
            "Array subscript out of range."
        return self._theRows[row][col]

    def __setitem__(self, key, value):
        assert len(key) == 2, "Invalid number of coordinates"
        row = key[0]
        col = key[1]
        assert 0 <= row < self.numRows() \
            and 0 <= col < self.numCols(), \
            "Array subscript out of range."
        self._theRows[row][col] = value




The matrix class is implemented from the above Array2d class:


from src import Array2D

class Matrix:

    def __init__(self, numRows, numCols,initialValue):
        self._matrix = Array2D.Array2d(numRows, numCols)
        self._matrix.initialize(initialValue)

    def numRows(self):
        return self._matrix.numRows()

    def numCols(self):
        return self._matrix.numCols()

    def __getitem__(self, key):
        return self._matrix.__getitem__(key)

    def __setitem__(self, key, value):
        return self._matrix.__setitem__(key, value)

    def scaleBy(self,scalar):
        for row in range(self.numRows()):
            for col in range(self.numCols()):
                self[row, col] *= scalar

    def transpose(self):
        trans = Matrix(self.numCols(),self.numRows(), 1)
        for r in range(self.numCols()):
            for c in range(self.numRows()):
                trans[r, c] = self[c, r]
        return trans

    def __add__(self, matrixB):
        assert matrixB.numRows() == self.numRows() \
            and matrixB.numCols() == self.numCols(), \
            "Matrix sizes not compatible for the add operation"
        sumMatrix = Matrix(self.numRows(),self.numCols(),0)
        for r in range(self.numRows()):
            for c in range(self.numCols()):
                sumMatrix[r, c] = self[r, c] + matrixB[r, c]
        return sumMatrix

    def __sub__(self, matrixB):
        #indention after '\'
        assert matrixB.numRows() == self.numRows() \
            and matrixB.numCols() == self.numCols(), \
            "Matrix sizes not compatible for the subtraction operation"
        difference = Matrix(self.numRows(),self.numCols(),0)
        for r in range(self.numRows()):
            for c in range(self.numCols()):
                difference[r, c] = self[r, c] - matrixB[r, c]# = difference.__getitem__(r, c)
        return difference

    def __mul__(self, matrixB):
        assert matrixB.numRows() == self.numCols() \
            and matrixB.numCols() == self.numRows(), \
            "Matrix sizes not compatible for the subtraction operation"
        product = Matrix(self.numRows(), matrixB.numCols(), 0)
        for r in range(self.numRows()):
            for c in range(matrixB.numCols()):
                for n in range(self.numCols()):
                    product[r, c] += self[r, n] * matrixB[n, c]
        return product

    def print(self):
        for r in range(self.numRows()):
            for c in range(self.numCols()):
                print(self[r, c], end=' ')
            print()


This blog is more about the details of writing Python than the actual implementation of the two classes. Here are some of the notes:

1. Modules
Module is a familiar name in Python. Officially, it defines as:
A module is a file containing Python definitions and statements.a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. 
It would be easier to think it as just files you write your code.  There is no module in Java upon Java 8 ("In Java 9, "modules", a kind of collection of packages, are planned as part of Project jigsaw these were earlier called "superpackages" and originally planned for Java 7"), but Java uses "package" for similar purposes.

To create an object from a class written in a module, we have to call that class from the module. e.g.,

matrixA = Matrix.Matrix(rows, cols, 1)


This definitely is not a good naming strategy. I was following a tutorial in which they defined a class inside the module, I thought it was the same as in Java, apparently I was not right. However, there is an easier way, which is just writing functions inside the file without defining "class Matrix", I will remember next time. :)

2. __setitem__(self, key, value), __getitem__(self, key) and methods to emulating container types
Theres two methods are part of the methods that can be used to emulate containers (e.g., dictionary, list, etc. ). __getitem__ () can be called by self[key]. 
One easy way to set a value is to call:

self[key] = value

3. Explicit and implicit line joining
It is better for the length of a line of code to be within 80 characters, so when a line is too long, we need to separate it to two lines. In Python, two or more physical lines may be joined into logical lines using backslash ('\'). Note a line ending in a backslash cannot carry a comment. Indention is preferred after the first line.


assert matrixB.numRows() == self.numRows() \
            and matrixB.numCols() == self.numCols(), \
            "Matrix sizes not compatible for the subtraction operation"


Implicit line joining is used in parentheses, square brackets or curly braces and can carry comments in the end of the line.
Java does not require explicit line joining.

4. Comparison
 This is actually pretty cool. Finally we can write comparison the way we write on paper:

0 <= row < self.numRows()


Apparently Java does not allow us to write it in such way.


The tester code:

from src import Matrix

def main():
    if __name__ == "__main__":
        rows = 3
        cols = 2
        matrixA = Matrix.Matrix(rows, cols, 1)
        matrixB = Matrix.Matrix(cols, rows, 1)
        matrixC = Matrix.Matrix(rows, cols, 1)
        matrixA[0, 1] = 2
        matrixA[2, 1] = 3
        matrixB[1, 1] = 3
        matrixB[0, 2] = 5
        matrixC[0, 1] = 3
        matrixC[0, 1] = 1

        print("Matrix A", end='\n')
        matrixA.print()

        print("\nMatrix B", end='\n')
        matrixB.print()

        print("\nMatrix C", end='\n')
        matrixC.print()

        sumM = matrixA.__add__(matrixC)
        print("\nSum of A and C", end='\n')
        sumM.print()

        difM = matrixA.__sub__(matrixC)
        print("\nDifference between A and C", end='\n')
        difM.print()

        prodM = matrixA.__mul__(matrixB)
        print("\nProduct of A and B", end='\n')
        prodM.print()

        print("\nTranspose of A", end='\n')
        matrixA.transpose().print()


main()

More
I was going through the fancy Numpy's implementation of matrix, then I found the build in function __new__(). I thought that __init__() serves as the constructor role, apparently I was only partially right: they do together. __new__() is responsible of creating a new instance, it can return an instance of the class, while __init__() takes charge of customizing it. If __new__() returns an instance of the class, then the new instance's __init__() method will be invoked. Since __new__() and __init__() work together to construct new objects,  NO non - None value may be returned by __init__(). 

More2
I wanted to check Numpy's matrix dot product implementation, since my brutal force one requires O(n^3), apparently it turns out to be a more complicated subject. See here for the Stackoverflow discussion and here for the Numpy implementation. I am going to stop here since the ALTAS is apparently beyond my concern. :)


No comments:

Post a Comment