Python csv with quotes

Python CSV writer, how to handle quotes in order to avoid triple quotes in output

I am working with Python’s csv module, specifically the writer. How can I add double quotes to a single item in a list and have the writer write the string the same way a print statement would? For example:

import csv

# test "data"
test = ['item1', '01', '001', 1]
csvOut = csv.writer(open('file.txt', 'a'))  # 'a' used for keeping past results
test[1] = '"' + test[1] + '"'
print(test)  # prints: ['item1', '"01"', '001', 1]
csvOut.writerow(test)
# written in the output file: item1,"""01""",001,1
# I was expecting: item1,"01",001,1
del csvOut

I tried adding a quoting=csv.QUOTE_NONE option, but that raised an error. I am guessing this is related to the many csv dialects, which I was hoping to avoid digging too far into. In retrospect I could probably have built my initial data set smarter and avoided this situation, but at this point curiosity is really getting the better of me (this is a simplified example): how do you keep the written output from adding those extra quotes?

Writing output this way is part of generating a valid CSV that represents your data the way you’ve specified. Generating an invalid CSV file (or one that writes quotes-as-syntax when you’ve passed in quotes-as-data) would be a bug, not a feature.

What do you want it to look like? And what other CSV-handling code do you want to interoperate with? (If you’re creating files to be fed into Excel, or np.loadtxt, or whatever, your output had better be readable by that other program.)


The existing answers cover it, but to be perfectly clear it’s worth noting that if you read the output back in with csv.reader you will get your original string with the quotes back. Try it.

@JasonS: Good point; I probably didn’t need to bring up Excel or np.loadtxt; the most common thing you want to interoperate with is your own code. 🙂

I was trying to make this interoperable with other software down the line but didn’t want to muddy the waters. I was also trying to understand what was going on (but wanted to ask a specific, answerable question). In case others stumble on this: if you’re interested in better interoperability with other programs, search for things like "schema.ini" and "csvt file extension" (that solved my ultimate problem, but the answer below still helped).

2 Answers

It’s not actually triple-quoting, although it looks that way. Try it with another example to see:

Now you’ll see that it writes this:

In other words, it’s just wrapping quotes around your string, and escaping the literal quote characters by doubling them, because that’s how default Excel-style CSV handles quote characters.
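A minimal sketch of that behaviour, for anyone who wants to try it. The in-memory io.StringIO buffer and the sample row are illustrative assumptions, not taken from the question. It writes fields containing literal quotes with the default dialect and then reads the line back with csv.reader:

```python
import csv
import io

# Sample row (made up): two of the fields contain literal double quotes.
row = ['item1', '"01"', 'say "hi"', 1]

# Write with the default (Excel-style) dialect into an in-memory buffer.
buf = io.StringIO()
csv.writer(buf).writerow(row)
written = buf.getvalue()
print(written)    # item1,"""01""","say ""hi""",1

# Reading the line back undoes the quoting; every field comes back as a string.
read_back = next(csv.reader(io.StringIO(written)))
print(read_back)  # ['item1', '"01"', 'say "hi"', '1']
```

The doubled quotes exist only in the file; any reader that follows the same convention gives you back the original strings.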

The question is, what format do you want here? Almost anything you want (within reason) is doable, but you have to pick something. Backslash-escaping quotes? Backslash-escaping everything instead of using quotes in the first place? Single quotes instead of double quotes?

For example, this looks like an answer:

csvOut = csv.writer(open('file.txt','a'), quotechar="'") 

… until you have an item like Filet O'Fish and the whole thing gets single-quoted and the ' gets doubled, and you have the exact same problem you were trying to avoid. If you’re aiming for human readability, and ' is a lot less common in your data than ", that may actually be the right answer, but it’s not a perfect answer.
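To make the trade-off concrete, here is a small sketch with made-up rows, writing to an in-memory buffer rather than file.txt (both are assumptions for illustration). The double-quoted item passes through untouched, while the item containing an apostrophe gets quoted and doubled:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, quotechar="'")

# Double quotes are now ordinary data, so this row is written as-is.
writer.writerow(['item1', '"01"', '001', 1])
# The apostrophe is the quote character, so this field is quoted and doubled.
writer.writerow(["Filet O'Fish", 2])

lines = buf.getvalue().splitlines()
print(lines[0])  # item1,"01",001,1
print(lines[1])  # 'Filet O''Fish',2
```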

And really, no answer can be perfect: you need some way to either quote or escape commas—and other things, like newlines—and the way you do that is going to add at least one more character that needs to be quote-doubled or escaped. If you know there are never any commas, newlines, etc. in your data, and there’s at least one other character you know will never show up, you can get away with setting either quotechar to that other character, or escapechar to that other character and quoting=QUOTE_NONE . But the first time someone unexpectedly uses the character you were sure would never appear, your code will break, so you’d better actually be sure.
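A rough sketch of the QUOTE_NONE route, under the assumption that a backslash never appears in your data. Passing quotechar=None is my own choice here so that double quotes are treated as plain data; the sample rows and the in-memory buffer are also assumptions:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(
    buf,
    quoting=csv.QUOTE_NONE,  # never wrap fields in quotes
    quotechar=None,          # no quote character, so '"' is just data
    escapechar='\\',         # assumed never to occur in the data itself
)

writer.writerow(['item1', '"01"', '001', 1])
# A delimiter inside a field has to be escaped, since it cannot be quoted.
writer.writerow(['a,b', 2])

lines = buf.getvalue().splitlines()
print(lines[0])  # item1,"01",001,1
print(lines[1])  # a\,b,2
```

Whatever reads this file has to be told about the same escapechar, otherwise the escaped comma will be split into two fields.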


csv writer in Python with custom quoting

I’m looking for a way to define custom quoting with csv.writer in Python. There are 4 built-in ways to quote values:

csv.QUOTE_ALL, csv.QUOTE_MINIMAL, csv.QUOTE_NONNUMERIC, csv.QUOTE_NONE 

However, I need a quoting mechanism that emulates Postgres’ FORCE QUOTE *, i.e. it quotes all non-None values. With csv.QUOTE_ALL Python turns None into "" but I would like an empty string instead. Is it possible to do that with the built-in csv module (I’m not interested in hacks, I’m already doing that 😛)? Or am I forced to write/get some custom csv parser? And generally: is it possible to write a custom quoting mechanism for the csv module?

2 Answers

Disable csv quoting and add the quotes yourself:

import csv

def quote(col):
    if col is None:
        return ''
    # uses double-quoting style to escape existing quotes
    return '"{}"'.format(str(col).replace('"', '""'))

writer = csv.writer(fileobj, quoting=csv.QUOTE_NONE, escapechar=None, quotechar=None)
for row in rows:
    writer.writerow(map(quote, row))

By leaving both escapechar and quotechar unset (None; recent Python 3 releases reject empty strings here) you avoid the module quoting your already-quoted values.

The above works as long as you don’t use the delimiter in the csv values.

Note that by this time it would just be easier to write comma-delimited lines yourself:

with open(filename, 'w') as fd:
    for row in rows:
        fd.write(','.join(map(quote, row)) + '\r\n')

Already tried that: the problem is that you have to specify escapechar with QUOTE_NONE and then it escapes quotes.

Ah, very interesting. However I do need to escape delimiters. I think I’ll just stick with my custom implementation. 🙂 Thank you.

I’m surprised it is not possible to specify a custom lambda function to handle the quoting. Unless I misunderstand what quoting is supposed to do.

I’ve written my own csv writer which does exactly what I want:

class PostgresCSVWriter(object):
    def __init__(self, stream, quotechar="\"", delimiter=",", escapechar="\\"):
        self.stream = stream
        self.quotechar = quotechar
        self.delimiter = delimiter
        self.escapechar = escapechar
        self.buffer_size = 16384

    def _convert_value(self, obj):
        if obj is None:
            return ""
        value = str(obj)
        value = value.replace(self.quotechar, self.quotechar+self.quotechar)
        value = value.replace(self.delimiter, self.escapechar+self.delimiter)
        return self.quotechar+value+self.quotechar

    def _convert_row(self, row):
        return self.delimiter.join(self._convert_value(v) for v in row) + "\r\n"

    def writerow(self, row):
        self.stream.write(self._convert_row(row))

    def writerows(self, rows):
        data = ""
        counter = 0
        for row in rows:
            buf = self._convert_row(row)
            data += buf
            counter += len(buf)
            if counter >= self.buffer_size:
                self.stream.write(data)
                data = ""
                counter = 0
        if data:
            self.stream.write(data)

If anyone sees any problem with it, then please let me know. I’m still looking for a solution with csv module though.
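For later readers: Python 3.12 added a csv.QUOTE_NOTNULL constant, which quotes every field except None and writes None as an empty, unquoted field. The sketch below uses it when available and otherwise falls back to manual quoting. The helper name force_quote_rows, the fallback code, and the sample row are my own assumptions, not part of the csv module or of the answers above:

```python
import csv
import io

def force_quote_rows(rows):
    """Return CSV text with every non-None value quoted (hypothetical helper)."""
    buf = io.StringIO()
    quoting = getattr(csv, "QUOTE_NOTNULL", None)  # only exists on Python 3.12+
    if quoting is not None:
        csv.writer(buf, quoting=quoting).writerows(rows)
    else:
        # Fallback for older versions: quote by hand, doubling embedded quotes.
        for row in rows:
            fields = (
                "" if value is None
                else '"' + str(value).replace('"', '""') + '"'
                for value in row
            )
            buf.write(",".join(fields) + "\r\n")
    return buf.getvalue()

print(force_quote_rows([[1, None, 'a"b', 'x,y']]))  # "1",,"a""b","x,y"
```

Because every written value is wrapped in quotes, delimiters inside a value need no extra escaping.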


How do I put single-quotes around string using Python csv writer?

Python has more than one quoting style; you can use either "'", or explicitly escape the quote with '\''.

I can’t seem to find anything that gives a ‘real’ csv output using the csv.writer stuff. The ‘hacky’ way is to just write the emails to a file adding the necessary characters in the write statement. See my edit.

4 Answers

Change your quote characters as noted by others or use a backslash escape.

 email_writer = csv.writer(csvfile, quotechar='\'', quoting=csv.QUOTE_NONNUMERIC,delimiter=',') 

We seem to have missed your second problem:

This is a hack because I can’t seem to figure out how to do this with the csv module. Rather than writing your files using the csv.writer , you can try it this way:

for control_rent_email in control_group_renters:
    csvfile.write('\'' + control_rent_email + '\'' + ',')

You also don’t need to close the file explicitly: the way you’re opening it closes it automatically once it is no longer referenced.

I hope someone out there can figure out how to do this in a less hacky way, particularly using the csv.writer. This way will suit your needs though.

@alexis So it’s not just me then? I searched and tested for a bit to see if it was possible but I couldn’t come up with a solution other than what I put above. I would think the csv module would have something that would do this easily.

The CSV module is for producing variants of the CSV format. Did you notice that (s)he doesn’t want a comma after the last line? And why force the quotes around strings that don’t need to be quoted, but have a trailing comma? That’s not CSV. You can do it by using QUOTE_NONE , inserting the surrounding quotes by hand (as you did above), and passing an empty string to trigger a trailing comma (see my answer). Your solution is far more appropriate, as I said.

A csv writer is a device you can use to write multiple lines into a file. You should only create one. Tell it to use a single quote as your quote character, and use it this way:

with open("emails.csv", "w") as csvfile:
    email_writer = csv.writer(csvfile, quotechar="'", quoting=csv.QUOTE_NONNUMERIC, delimiter=',')
    for control_rent_email in control_group_renters:
        email_writer.writerow([control_rent_email])

I hope that you’re giving a simplified example of what you really need; otherwise, why go to all this trouble when you’re only printing one string per line? If you really want a trailing comma on each line except the last, you’re really not aiming for any sort of CSV and you should construct your outputs directly.

That said, you can get trailing commas by passing an empty string to writerow , but only if you use QUOTE_MINIMAL or QUOTE_NONE (which will suppress the pointless quotes around the emails), instead of QUOTE_NONNUMERIC ; otherwise you’ll get quotes around the empty strings.

email_writer.writerow([control_rent_email, ""]) 
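A quick sketch of the difference between the two quoting modes when an empty string is appended to each row. The email addresses are made-up sample data and the in-memory buffers stand in for the real file:

```python
import csv
import io

emails = ["a@example.com", "b@example.com"]  # made-up sample data

# QUOTE_MINIMAL: no quotes around the emails, empty field gives a trailing comma.
minimal = io.StringIO()
writer = csv.writer(minimal, quoting=csv.QUOTE_MINIMAL)
for email in emails:
    writer.writerow([email, ""])

# QUOTE_NONNUMERIC with a single-quote quotechar: the empty string is quoted too.
nonnumeric = io.StringIO()
writer = csv.writer(nonnumeric, quotechar="'", quoting=csv.QUOTE_NONNUMERIC)
for email in emails:
    writer.writerow([email, ""])

print(minimal.getvalue().splitlines())     # ['a@example.com,', 'b@example.com,']
print(nonnumeric.getvalue().splitlines())  # ["'a@example.com',''", "'b@example.com',''"]
```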

For completeness, here’s how you can really generate exactly the output you want, without the final comma and all:

with open("emails.txt", "w") as output:
    output.write(",\n".join("'" + r + "'" for r in control_group_renters) + "\n")


Writing csv with quotes around strings (Python)

I have written the following code to take a large csv file, and split it into multiple csv files based on a particular word in a column. The original csv file has some fields that are strings, and they have quotes around them. For example:

Field1,Field2,Field3,Field4
1,2,"red",3
1,4,"red",4
3,4,"blue",4

3.csv
Field1,Field2,Field3,Field4
1,2,red,3

4.csv
Field1,Field2,Field3,Field4
1,4,red,4
3,4,blue,4

I want my output to maintain quotes around the strings in field 3. The files are fed into a piece of software that only works if strings have quotes around them, which is quite annoying. My current code looks like this:

import csv

# Creates empty set - this will be used to store the values that have already been used
newfilelist = set()

# Opens the large csv file in "read" mode
with open('File.csv', 'r') as csvfile:
    # Read the first row of the large file and store the whole row as a string (headerstring)
    read_rows = csv.reader(csvfile)
    headerrow = next(read_rows)
    headerstring = ','.join(headerrow)
    for row in read_rows:
        # Store the whole row as a string (rowstring)
        rowstring = ','.join(row)
        # Takes Field 4
        newfilename = (row[3])
        # This basically makes sure it is not looking at the header row.
        if newfilename != "field4":
            # If the newfilename is not in the newfilename set, add it to the list and create new csv file with header row.
            if newfilename not in newfilelist:
                newfilelist.add(newfilename)
                with open('//output/' + str(newfilename) + '.csv', 'a') as f:
                    f.write(headerstring)
                    f.write("\n")
                    f.close()
            # If the newfilename is in the newfilelist set, append the current row to the existing csv file.
            else:
                with open('//output/' + str(newfilename) + '.csv', 'a') as f:
                    f.write(rowstring)
                    f.write("\n")
                    f.close()

Can anybody advise me how to get the quotes around the strings? Unfortunately the software that uses my files requires them to be in this format!
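One possible approach, sketched under some assumptions: csv.reader strips the quotes on input, so the writer has to put them back. Here every field made up only of digits is converted to int and the rows are written with csv.QUOTE_NONNUMERIC, which quotes everything that is not a number. The helper names to_number and split_by_column are hypothetical, the digit test is a heuristic (a quoted "01" in the input would come out as 1), and the sketch works on in-memory text instead of the real files:

```python
import csv
import io

def to_number(field):
    """Heuristic: all-digit fields become ints, anything else stays a string."""
    return int(field) if field.isdigit() else field

def split_by_column(text, key_index=3):
    """Group rows by the value in column key_index; return {value: csv text}."""
    buffers = {}
    writers = {}
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    for row in reader:
        key = row[key_index]
        if key not in writers:
            # New group: start its output with the unquoted header line.
            buffers[key] = io.StringIO()
            buffers[key].write(",".join(header) + "\r\n")
            writers[key] = csv.writer(buffers[key], quoting=csv.QUOTE_NONNUMERIC)
        # Strings get quoted, ints are written bare.
        writers[key].writerow([to_number(field) for field in row])
    return {key: buf.getvalue() for key, buf in buffers.items()}

sample = 'Field1,Field2,Field3,Field4\n1,2,"red",3\n1,4,"red",4\n3,4,"blue",4\n'
for name, content in split_by_column(sample).items():
    print(name + ".csv")
    print(content)
```

To write real files, each io.StringIO could be replaced by a file opened with newline='' so the csv module controls the line endings.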
