poly4life Posted March 24, 2011

Hello, I'm having difficulty reading from and writing to the same file, and I'm not sure what I'm doing wrong. It works fine in read/write append mode, but that doesn't accomplish what I want: I'm trying to read the file in, do some processing on it, then overwrite the file. I honestly don't understand the purpose of the "+>" operator. "+<" works, but it appends to the file, while "+>" deletes the contents of the file immediately after I open it. Is "+>" simply not the right tool for reading from and (over)writing to the same file? I saw a Perl module called Tie::File, but I'm having difficulty using it and would appreciate some help with it. Otherwise, if I can't get "+>" working for this problem, I'll just read from the file, close it, and then write to the file.

Here's my code:

```perl
#!/usr/local/bin/perl -w
use Fcntl qw(:DEFAULT :flock :seek);    # import LOCK_* constants

local $/ = undef;    # slurp mode

# Read/Write with overwrite
open(FILE, "+>", $file) || die("Cannot open file");
flock(FILE, LOCK_EX);
seek(FILE, 0, SEEK_SET);
$file_data = <FILE>;

# Do some processing on $file_data here

# (Over)Write to same file
print FILE $file_data;
close(FILE);
```

Thank you.
allen2 Posted March 24, 2011

Of course "+>" is not right: if you open a file while deleting its contents, you won't read much data from it. The original purpose of this mode is to write to the file first, after erasing its contents, and then read it back. For what you're trying to do, "+<" should work.
CoffeeFiend Posted March 25, 2011

"I honestly don't understand the purpose of the "+>" operator. "+<" works fine, but it appends to the file. "+>" deletes the contents of the file immediately after I open it."

"+>" truncates the file (makes it a 0-byte file), so it's certainly hard to read from it. It will also create the file if it doesn't already exist (either way, you get that 0-byte file).

"+<" is read/write. However, after you're done reading the file, your "position" is at the end of the file, so if you start writing, that's where you'll be writing from -- essentially appending (very much like any other language would behave in this specific scenario). If you want to write from the beginning, you have to seek to the beginning first. Not that I would do it this way, unless you're 100% certain that the content you'll be writing will never be smaller by *any* amount (even a single byte), because then you'd have garbage appended at the end of your new file.

Your best bet (again, in any language -- so long as the files aren't huge) is to first open the file, read its contents into some sort of variable, then close it. Then you do whatever processing you wanted to do. Then you finally reopen it, this time for writing, *truncating* the old file, write the new content, and close it one last time. Or you can rename the old file as a backup (if you want one) and create the new file. That's much more fool-proof in most cases.
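[Editor's note] The open/read/close, process, open/write/close pattern described above could be sketched roughly like this; the filename and the regex substitution are placeholders, and the sketch creates its own sample file so it runs standalone:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = 'test.txt';    # placeholder filename for this demo

# Create a small sample file so the sketch is self-contained.
open(my $fh, '>', $file) or die "Cannot create $file: $!";
print $fh "foo one\nfoo two\n";
close($fh);

# Step 1: open read-only, slurp the whole file, close it.
open(my $in, '<', $file) or die "Cannot open $file for reading: $!";
my $data = do { local $/; <$in> };
close($in);

# Step 2: do the processing (placeholder substitution).
$data =~ s/foo/bar/g;

# Step 3: reopen for writing; '>' truncates the old contents,
# so no stale bytes can be left behind even if $data shrank.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
print $out $data;
close($out);
```

Because the second open truncates, the shrinking-file problem described above never arises with this pattern.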
poly4life Posted March 25, 2011 (Author)

Yes, thank you, I see that now: "+>" is not the right tool for the job. However, I was looking at overwrite, as opposed to append, because I didn't wish to append data to the end of the file. FYI -- I should've mentioned this in the original post -- I am running Strawberry Perl 5.12.2.0 on Windows XP SP3 32-bit, the file I'm opening is a text file, and some of the processing involves regexes.

After processing, I rewrite the entire file, instead of just a portion of it. The problem is that much of the rewrite starts at the beginning of the file. Again, my logic was: instead of picking and choosing which parts of the file to rewrite, why not just read the entire contents of the text file (it's not a huge file) into Perl, do the processing on it, then rewrite the entire file?

I looked at the seek docs and did some more research on append and seek, and at least on Unix/Linux it is not possible to sysseek or seek with append. I had tried it, too, and it would only append to the end of the file.

SOURCE: http://www.justlinux.com/forum/showthread.php?t=131467

My code for read/append ("+>>"):

```perl
#!/usr/local/bin/perl -w
use Fcntl qw(:DEFAULT :flock :seek);    # import LOCK_* constants

open(FILE, "+>>", "test.txt") || die("Cannot open file");
flock(FILE, LOCK_EX);
seek(FILE, 0, SEEK_SET);
$file_data = <FILE>;
print $file_data;
print FILE "xxx";
close(FILE);
print "\n\n-------\n\n";
```

Beforehand, I opened the file in read mode, copied its contents, closed the file, opened the file again in write/overwrite mode, wrote to the file, and, finally, closed the file. But I thought: why open and close the same file twice, when I may be able to do it all in one shot? It would be more efficient, with less code, making it potentially easier to maintain and debug, and less of a performance hit. With one file, it's no big deal, but if I'm processing many, many files (i.e. reading in a directory), I could see a performance hit.
So this is why I posted in the first place: to learn if there's a better way. I like the suggestion about the backup, thank you. I suppose I'll just open it twice, unless I can get the Tie::File module to work correctly. Thank you both again.
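[Editor's note] Since Tie::File came up twice without an example: a minimal sketch of how it could be used here. Tie::File ships with Perl 5.12, maps the file to an array of lines, and writes array edits back to the file; the filename and the substitution below are placeholders, and the sketch creates its own sample file so it runs standalone:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use Fcntl qw(:flock);

my $file = 'test.txt';    # placeholder filename for this demo

# Create a small sample file so the sketch is self-contained.
open(my $fh, '>', $file) or die "Cannot create $file: $!";
print $fh "foo one\nfoo two\n";
close($fh);

# Tie the file to an array: each element is one line (newline stripped).
tie my @lines, 'Tie::File', $file or die "Cannot tie $file: $!";
(tied @lines)->flock(LOCK_EX);    # Tie::File provides a flock method

# Edits to the array elements are written back to the file.
for my $line (@lines) {
    $line =~ s/foo/bar/g;    # placeholder processing
}

untie @lines;    # flush changes and release the file
```

This hides the whole seek/truncate question: Tie::File handles lines growing or shrinking on its own.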
gunsmokingman Posted March 26, 2011

I do not know if this will help or not, but here is a VBS script that:
1. Opens the text file and reads its contents into one variable called V1.
2. Uses the V1 variable to rewrite the text file, adding "Add 1" at line 4 and "Add 2" at line 7, then closes the text file with the changes saved.

```vb
Const ForReading = 1, ForWriting = 2, ForAppending = 8
'-> Object For Script
Dim Fso : Set Fso = CreateObject("Scripting.FileSystemObject")
'-> Variables For Use
Dim C1, File, Ts, V1, V2
'-> Check To Make Sure File Is Present
File = "Test_Text.txt"
If Fso.FileExists(File) Then
  '-> Loop To Read All The Text File Into Variable V1
  Set Ts = Fso.OpenTextFile(File, ForReading, True)
  Do Until Ts.AtEndOfStream
    V1 = Ts.ReadAll
  Loop
  Ts.Close
  '-> Loop To Add The New Add 1, Add 2 At Lines 4 And 7
  Set Ts = Fso.OpenTextFile(File, ForWriting, True)
  For Each V2 In Split(V1, vbCrLf)
    C1 = C1 + 1
    '-> Add To Line 4 And Line 7
    If C1 = 4 Then
      Ts.WriteLine "Add 1 " & V2
    ElseIf C1 = 7 Then
      Ts.WriteLine "Add 2 " & V2
    ElseIf V2 = "" Then
      '-> Do Nothing, It's A Blank Line
    Else
      '-> Add The Unchanged Line Back To The File
      Ts.WriteLine V2
    End If
  Next
  Ts.Close
Else
  MsgBox "Missing This Text File: " & File
End If
```
Yzöwl Posted March 26, 2011

Quick question, gsm: wouldn't I need to add to line four and line six in order to add to lines four and seven respectively? If I add to line four, it would mean that old line four becomes new line five, old line five becomes new line six, and old line six becomes new line seven! I'd suggest the term "append to line".
gunsmokingman Posted March 27, 2011

"If I add to line four it would mean that old line four became new line five"

No, the script only adds to the front of the line, so line 4 remains line 4 after the change. V2 would be line 4 from the variable V1 after it had been split with vbCrLf:

If C1 = 4 Then Ts.WriteLine "Add 1 " & V2

Contents of Test_Text.txt before the script runs:

Line 01
Line 02
Line 03
Line 04
Line 05
Line 06
Line 07
Line 08
Line 09
Line 10

After the script has run once:

Line 01
Line 02
Line 03
Add 1 Line 04
Line 05
Line 06
Add 2 Line 07
Line 08
Line 09
Line 10

If you run the script twice:

Line 01
Line 02
Line 03
Add 1 Add 1 Line 04
Line 05
Line 06
Add 2 Add 2 Line 07
Line 08
Line 09
Line 10
CoffeeFiend Posted March 27, 2011

"would I need to add to line four and line six to add to lines four and line seven respectively"

It doesn't really matter. The OP is doing something completely different in the first place, as he explained in post #4 (using regular expressions). And then again, he already had a working solution that did the file open/read/close, the processing, and then the file open/write/close separately. His only problem was opening the file just once with R/W access, reading from it, seeking back (which he wasn't doing, so it was effectively appending), and then writing again -- which, as-is, was a bad idea in the first place: if your new content is shorter than the old one, you end up with junk (contents from the old file) tacked on at the end of it (I tried to explain before that he had to seek back, and that even this often wouldn't be sufficient to solve the problem).

So changing language without any real technical merit or benefit, not using regular expressions, and adding specific logic that's completely irrelevant to the problem? OK, whatever... But this doesn't actually address his actual problem in any way: reading from and writing to the same file by opening it just once.
Not mentioned (because he hasn't discovered the next problem that would arise once he gets this working), but it must be able to "shrink" the file too if necessary. There is a way to do exactly what he's asking for (in Perl, still using regexes and all), not that it really offers any actual benefit vs opening it twice IMO:

- open the file with RW access (using "+<")
- read its contents into some variable, storing the size/length of those "old" contents in another variable
- do the processing on it just like before
- seek to the beginning of the file: seek(FILE, 0, SEEK_SET); (what he wasn't doing after reading from the file, thus making it append)
- write your new content
- if the size of the "new" content is less than the size of the old contents previously stored, call truncate(FILE, newSizeHere); on it (discarding the extraneous bytes)
- close the file

Not that it's any better than his current/old solution IMO. He's seemingly trying to do this for performance, but saving a file open/close operation (just getting a file handle) vs the added seek and truncate operations... there's basically going to be no measurable difference between the two (way less than 1 ms difference*). I'll much sooner use the code that's more solid (proper error handling for starters), better written and documented, easier to understand, more versatile/reusable, better tested (e.g. has good unit tests), easier to use, will be better supported in the future, etc.

Either way, I think this is completely pointless in the first place (and this is why I haven't bothered spending the five minutes to write code that does exactly this). This particular problem (replacing text using regular expressions) was already solved 35 years ago by AWK (using sub or gsub). He's just reinventing the wheel, and poorly at that.

* Edit: allen2 sure has a good point there too (see post below).
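[Editor's note] The single-open recipe above could be sketched like this; the filename and the substitution are placeholders, and the sketch creates its own sample file so it runs standalone:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

my $file = 'test.txt';    # placeholder filename for this demo

# Create a sample file so the sketch is self-contained.
open(my $fh, '>', $file) or die "Cannot create $file: $!";
print $fh "foobar and foobar again\n";
close($fh);

# Open ONCE with "+<": read/write, no truncation, no forced append.
open(my $rw, '+<', $file) or die "Cannot open $file: $!";
flock($rw, LOCK_EX) or die "Cannot lock $file: $!";

my $old     = do { local $/; <$rw> };    # slurp the old contents
my $old_len = length $old;               # remember the old size

(my $new = $old) =~ s/foobar/baz/g;      # placeholder processing

seek($rw, 0, SEEK_SET) or die "Cannot seek: $!";    # back to the start
print $rw $new;

# If the new contents are shorter, chop off the leftover old bytes.
truncate($rw, length $new) if length($new) < $old_len;

close($rw);
```

Without the truncate call, the tail of the old contents would survive whenever the replacement shrinks the file, which is exactly the junk-at-the-end problem described earlier in the thread.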
I mean, if this executes once in a while, it's pointless to spend hours of coding to shave off a few microseconds of execution time. But if you're going to use this in a situation where it actually matters (like running it a billion times in a loop), then a scripting language probably isn't the best tool for the job in the first place (you'd want something compiled for sure -- and probably make the tool iterate through the files instead of running it a bazillion times). Then again, sometimes regular expressions are also overkill (or not the best pick) for the job, and something like a Boyer-Moore search might be faster for finding the parts that need replacing. I don't personally bother much with optimization (assuming the code is already half-decently written) until it actually becomes a problem (then you profile and see what needs to be optimized -- the file I/O, the time spent on string ops, the time spent spawning the same process repeatedly, etc. -- and then address that particular problem).
allen2 Posted March 27, 2011

I'll just add that if you want fast execution speed, you most likely shouldn't use a scripting language.