Saturday, December 4, 2010

Git Gotta

>> mkdir foo
>> git init
Initialized empty Git repository in /home/jianlu/repos/web/foo/.git
>> git branch master
fatal: Not a valid object name: 'master'.

// How to fix this?
// make an arbitrary checkin by git submit. You are done.

Update:
git commit --allow-empty -m "initial commit"

Source of this update: (thanks to Mohamed Meligy's tip)

>> git branch
master.

Git claims this to be by design because before the user makes the first initial commit, there is nothing to be branched.

This kinda makes sense but not always a valid restriction. It is not a rare case that you want to lay out your repository structure because you know the repro structure should be like that even before you write the first line of your project code.

and btw, this is a terrible error message. It barely says anything about the real cause of this failure, or any suggestion on possible fixes.

Sunday, November 28, 2010

Some random mysql pitfalls.

Constraints
Only PK, FK, Nullability, Unique are enforced in InnoDB. Check (expression) is not. So don't expect the following check constraint to help you catch any insertion/update with foo.value <= 20.



FK
both sides of your FK should have exactly the same signature. Otherwise you get errno 150, which indicates a syntax error in FK definition.

Monday, September 27, 2010

Uninstalling MySQL on Mac OS X

Source: http://steveno.wordpress.com/2009/03/26/uninstall-mysql-on-mac-os-x/

  • sudo rm /usr/local/mysql
  • sudo rm -rf /usr/local/mysql*
  • sudo rm -rf /Library/StartupItems/MySQLCOM
  • sudo rm -rf /Library/PreferencePanes/My*
  • edit /etc/hostconfig and remove the line MYSQLCOM=-YES-
  • sudo rm -rf /Library/Receipts/mysql*
  • sudo rm -rf /Library/Receipts/MySQL*

Tuesday, August 31, 2010

Arrange number 1..n so that each pair of consecutive numbers sum up to a square number

Problem Statement
Write a program to arrange numbers from 1 to n such that the sum of two consecutive numbers is a perfect square.

An interesting way of looking at this problem
Make a graph so that each node is one of the numbers and there is an edge between two nodes if these two numbers can add up to a square number. Find a path starting from one node and end up in another so that each node is visited once and all nodes are visited. Such path may not exist.

There should be an easier way to do this by seeking for nodes with only one edge in/out...

Wednesday, August 25, 2010

Given a dictionary of words and a string with all spaces removed, determine whether the string is composed of valid words

Problem Statement

Given a dictionary of words and a string with all spaces removed, determine whether the string is composed of valid words

Examples
helloworld-> hello world (valid)
isitniceinhere-> is it nice in here (valid)
zxyy-> invalid

Time Complexity Requirement
O(n^2)

Algorithm
True/False array T of size equal to the input. T[i] = true => the substring starting from i is composed of valid words.

Assume API method IsValidWord() is O(1) since we are given a dictionary.

for each element in the array:

T[i] = (IsValidWord(s.substring(i, 1)) && T[i+1])
|| (IsValidWord(s.substring(i, 2)) && T[i+2])
...
|| (IsValidWord(s.substring(i, n-i-1)) && T[n-1])

The result is T[0].

Complexity analysis
for each i, it takes O(n) to compute. and we start the computation from i = s.length - 1 and decrease i, and there are n element in the array to compute. So we get O(n^2) as the total time complexity.

Tuesday, August 24, 2010

Given two strings of a and b. Find the max window in a and b where the substrings in the window are anagrams.

Problem description:

1. Given two strings a and b, find the max value of "length" where for some i and j, a.substring(i, i+length) and b.substring(j, j+length) are anagrams.

2. Given two strings a and b, find the max window of matching patterns between a and b where the pattern matching doesn't have to be an exact match but also can be a permutation of each other. (which is essentially anagrams...)

Algorithm:

let m = a.length;
let n = b.length;
let matrix = int[m][n]

if a[i] == b[j] then matrix[i][j] = 1
else matrix[i][j] = 0

With the matrix populated by the routines above, find the largest square that meets the following two conditions:
  • no column contains all 0s. For each column, at least one cell is 1
  • no rows contains all 0s
The time complexity is O(n^2). I don't believe it can be done in O(nlogn). But I could be wrong.

Someone was asked the following question during interview: if m = n, would that help speeding up the complexity?

Well, I have no idea...

Thursday, July 8, 2010

Checking whether a binary tree is balanced or not using recursion

Balanced: no path from root to leaf node is longer than another such path by 1.

The code is just so elegant...



this can be extended to general trees with more than two children by examining all children's maxDepth or minDepth.

ArrayList vs. LinkedList

There has been quite a bit discussion already. So just sum up by a few words:

ArrayList
  • Lookup by index is O(1). (in Java, it implements RandomAccess) Suitable for dataset with random access pattern.
  • Insertion can be expensive. Worst case O(n).
    1. Insertion to the middle can cause shifting.
    2. Insertion to the end can cause changing in the size of the internal storage, usually doubling.
  • Deletion Depending on the implementation, it may vary a bit. Generally not bad.
LinkedList
  • Lookup by index is O(n) in worst case. (in Java, it implements Queue). Suitable for data set that most of operations against it is iteration, or access mostly happens at the beginning or the end, such as FIFO queues, FILO stacks.
  • Insertion is O(1)
  • Deletion
    1. Deletion by reference is O(1)
    2. Deletion by value is as bad as lookup.
General thoughts:

Comparing two data structures, always at least think from two perspectives:
  1. operations: lookup, insertion, deletion, manipulation.
  2. input data characteristics: matching the characteristics with the cost of each operation to find the better data structure to solve your problem.

Hashtable

Disadvantage: generally, it wastes spaces.
Advantage: generally, it does fast lookup.

Three normal types: Chained, Chained Scattered, Open Scatter



Tuesday, July 6, 2010

Finding all numbers which are smaller than K and only divisible by 2, 3, and 5

This may not be the most optimal solution...

Essentially we are looking for all possible {(x, y, z) | (2^x * 3^y * 5^z) <= K}
  1. When x+y+z = 1, we get {2, 3, 5},

  2. Let:

    S(i) = { (2^x * 3^y * 5^z) | (2^x * 3^y * 5^z) < K and (x + y + z = i)}

    S(i, j) = Union of all s in S(i-1, t) * j that:
    a. t and j belong to {2, 3, 5}
    b. t >= j
    c. s <= K

    for example:
    S(2,2) = {2, 3, 5} * 2 = {4, 6, 10}
    S(2,3) = {3, 5} * 3 = {9, 15}
    S(2,5) = {5} * 5 = {15}

  3. S(i) = S(i-1, 2) + S(i -1, 3) + S(i-1, 5).

    for example:
    S(2) = S(2,2) + S(2,3) + S(2, 5)

  4. Stops when S(i) is empty.