Unicode csv files in Python 2.x

In some recent web scraping projects I extracted data from an HTML document and saved it in a .csv file using Python's csv module. I used the BeautifulSoup module to parse and navigate the HTML, and since BeautifulSoup always returns text as unicode strings, writing special (non-ASCII) characters to the csv file was a real hassle: the csv module in Python 2.x does not support unicode input.

The documentation for the csv module provides some solutions to the problem, but I found that the easiest one was to just install jdunck’s unicodecsv module. It has the same interface as the regular csv module, which is great: if you already have a script that uses the regular module, you can just write import unicodecsv as csv (or whatever you imported csv as) and it should work.

Python 3.x does not have this problem, since all strings are unicode strings by default and its csv module works with them directly.
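For comparison, here is a minimal sketch of the Python 3 case using only the standard library. The helper names write_rows and read_rows, the file name and the sample rows are made up for illustration; the only real requirement is to open the file in text mode with an explicit encoding and newline="".

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write rows of text to a UTF-8 encoded CSV file (Python 3 only)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def read_rows(path):
    """Read the rows back as lists of str."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# Hypothetical sample data containing non-ASCII characters.
sample = [["name", "city"], ["Zoë", "Malmö"], ["José", "São Paulo"]]

# Write to a throwaway directory and read the file back.
path = os.path.join(tempfile.mkdtemp(), "people.csv")
write_rows(path, sample)
round_trip = read_rows(path)
```

No extra module or manual encoding step is needed here, because the file object does the encoding and the csv module only ever sees str values.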

A brainfuck interpreter in Python

I implemented a small brainfuck interpreter in Python. This implementation uses 8-bit memory cells and does not allow values outside the 0-255 range. If a command would push a cell outside that range, a ValueError is raised. By default the memory tape is 5000 cells long, but the length can be set by the user.

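Python 2.6 or later is required, since bytearray was introduced in 2.6. Note that the input command uses raw_input, which only exists in Python 2.x; under Python 3 that call would have to be changed to input.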
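The 0-255 limit comes for free from bytearray, which refuses to store anything that does not fit in one byte. Here is a small sketch of that behaviour on its own, outside the interpreter; the helper try_store is just a name made up for this illustration.

```python
def try_store(cells, index, value):
    """Return True if value fits in the cell, False if bytearray rejects it."""
    try:
        cells[index] = value
        return True
    except ValueError:
        # bytearray only accepts integers in range(0, 256).
        return False


tape = bytearray(3)   # three cells, all initialised to 0
tape[0] += 1          # fine: 0 -> 1
tape[1] = 255         # fine: the largest value a cell can hold

overflow_ok = try_store(tape, 1, 256)   # one past the maximum
underflow_ok = try_store(tape, 2, -1)   # one below the minimum
```

A rejected store leaves the cell unchanged, so the tape is still in a consistent state when the ValueError reaches the caller.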

Example:

bf = Brainfuck(100) # only 100 cells on the tape
twoPlusThree = "++>+++<[>+<-]"
bf.run_command(twoPlusThree)
print(bf.tape[1]) # the sum ends up in the second cell: prints 5

And here is the implementation:

class Brainfuck:
	def __init__(self, tape_length=5000):
		self.tape = bytearray(tape_length)
		self.tape_pointer = 0
		
	def run_command(self, cmd):
		cmd_pointer = 0

		running = True
		while running:
			if self.tape_pointer >= len(self.tape) or self.tape_pointer < 0 or cmd_pointer < 0 or cmd_pointer > len(cmd) - 1:
				break

			if cmd[cmd_pointer] == "+":
				self.tape[self.tape_pointer] += 1
			elif cmd[cmd_pointer] == "-":
				self.tape[self.tape_pointer] -= 1
			elif cmd[cmd_pointer] == "<":
				self.tape_pointer -= 1
			elif cmd[cmd_pointer] == ">":
				self.tape_pointer += 1
			elif cmd[cmd_pointer] == ".":
				print(chr(self.tape[self.tape_pointer]))
			elif cmd[cmd_pointer] == ",":
				self.tape[self.tape_pointer] = int(raw_input())
			elif cmd[cmd_pointer] == "[":
				if int(self.tape[self.tape_pointer]) == 0:
					lbcounter = 0
					searching = True			
					while searching:
						cmd_pointer += 1
						if cmd[cmd_pointer] == "[":
							lbcounter += 1
						elif lbcounter == 0 and cmd[cmd_pointer] == "]":
							searching = False 
						elif cmd[cmd_pointer] == "]":
							lbcounter -= 1
			elif cmd[cmd_pointer] == "]":
				if int(self.tape[self.tape_pointer]) != 0:
					rbcounter = 0
					searching = True			
					while searching:
						cmd_pointer -=1
						if cmd[cmd_pointer] == "]":
							rbcounter += 1
						elif rbcounter == 0 and cmd[cmd_pointer] == "[":
							searching = False
						elif cmd[cmd_pointer] == "[":
							rbcounter -= 1
			
			cmd_pointer += 1
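
The implementation above rescans the command string for the matching bracket every time a jump is taken. One alternative, not used in the code above, is to match all brackets once with a stack before running and keep the result in a lookup table. The following is only a sketch of that idea; build_jump_table is a name made up for this example.

```python
def build_jump_table(program):
    """Map the position of every bracket to the position of its partner."""
    stack = []   # positions of "[" that have not been closed yet
    jumps = {}
    for position, char in enumerate(program):
        if char == "[":
            stack.append(position)
        elif char == "]":
            if not stack:
                raise ValueError("unmatched ']' at position %d" % position)
            start = stack.pop()
            # Record the pair in both directions.
            jumps[start] = position
            jumps[position] = start
    if stack:
        raise ValueError("unmatched '[' at position %d" % stack[-1])
    return jumps


# The loop in the earlier example opens at index 7 and closes at index 12.
table = build_jump_table("++>+++<[>+<-]")
```

With such a table, each of the two inner search loops could be replaced by a single lookup of the form cmd_pointer = table[cmd_pointer], and unbalanced brackets would be reported before any command runs.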