Unread 18 Jan 2006, 15:31   #1
Dante Hicks
Clerk
 
Join Date: Jun 2001
Posts: 13,940
[Perl] File / Array Comparisons

I have two files; for the sake of argument we'll call them "old.txt" and "new.txt". Each is a series of records, one per line, of variable length.

I want to compare the two files and write the entries that differ to a new file ("changes.txt").

So for the sake of example, my first file ("old.txt") is thus :
Code:
1. This is the first line of the file.
5. This is the FIFTH line of the file.
3. This is the third line of the file.
4. This is the fourth line of the file.
2. This is the second line of the file.
6. This is the seventh line of the file.
My second file ("new.txt") is thus :
Code:
1. This is the first line of the file.
2. This is the second line of the file.
3. This is the third line of the file.
4. This is the fourth line of the file.
5. This is the fitfh line of the file.
6. This is the sixth line of the file.
7. This is the seventh line of the file.
8. This is the eight line of the file.
I'm not interested in what order the lines appear in, so ultimately I'd want the output to look like :
Code:
5. This is the fitfh line of the file.
6. This is the sixth line of the file.
7. This is the seventh line of the file.
8. This is the eight line of the file.
Because they're the changes, i.e. records in "new.txt" that aren't in "old.txt". Any ideas?

Obviously I'm trying to read both files into their respective arrays and then compare the elements one by one, deleting all duplicate records as I go, but my array handling in Perl is rusty to say the least, so I keep getting nonsense output. This seems like a staggeringly easy task but I've been staring at this for an hour now with no progress.

Help plz.
Unread 18 Jan 2006, 17:19   #2
queball
Ball
 
Join Date: Oct 2001
Posts: 4,410
Re: [Perl] File / Array Comparisons

Basically:
Code:
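use strict;
use warnings;

# Three-argument open with lexical filehandles
open my $old_fh, '<', 'old.txt' or die "old.txt: $!";
open my $new_fh, '<', 'new.txt' or die "new.txt: $!";

my @old = <$old_fh>;
my @new = <$new_fh>;

for my $line (@new) {
  unless (grep { $_ eq $line } @old) {
    push @old, $line; # remember it so a repeated line isn't printed again
    print $line;
  }
}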
(edit: changed to remove duplicates)

I'm not sure exactly what you meant by the line numbers though - perhaps you want to strip them. The problem is a little vague. You might instead want to count occurrences and print the lines which appear more often in new.txt than in old.txt:

Code:
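use strict;
use warnings;

open my $old_fh, '<', 'old.txt' or die "old.txt: $!";
open my $new_fh, '<', 'new.txt' or die "new.txt: $!";

my %lines;

$lines{$_}-- while (<$old_fh>);   # each occurrence in old.txt counts down
$lines{$_}++ while (<$new_fh>);   # each occurrence in new.txt counts up

for (keys %lines) {
  print if ($lines{$_} > 0);      # appears more often in new.txt
}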
__________________
#linux

Last edited by queball; 18 Jan 2006 at 17:29.
Unread 18 Jan 2006, 17:33   #3
Dante Hicks
Clerk
 
Join Date: Jun 2001
Posts: 13,940
Re: [Perl] File / Array Comparisons

The line numbers don't really matter; my example wasn't particularly good. I just meant that the file is generated by an external SQL database and I have no control over the order of the lines (if that makes sense).

Basically, ignore the line numbers point.

Thanks for the help, I will try your solution. I eventually ended up using hashes instead, deleting entries by key as I went through the comparisons, e.g.
Code:
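my %new_hash;
my $Count = 0;
foreach my $new (@new_file)
{
	$Count++;
	$new_hash{ $Count } = $new;         # assume the line is new...
	foreach my $old (@old_file)
	{
		if ($new eq $old)
		{
			delete $new_hash{$Count};   # ...until it turns up in the old file
			last;                       # no need to check the remaining old lines
		}
	}
}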
Although I've not tried it on the proper files which are 27k lines long.
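
On files of that size the nested loop above can make up to roughly 27,000 x 27,000 comparisons. A minimal sketch of the same comparison using a hash lookup instead of the inner loop follows; the helper name new_records is made up for illustration, and it assumes that a line counts as unchanged only when it matches an old line exactly (after stripping the newline).

```perl
use strict;
use warnings;

# Hypothetical helper: lines of @$new that appear nowhere in @$old,
# kept in their original order and reported once each.
sub new_records {
    my ($old, $new) = @_;
    my %seen = map { $_ => 1 } @$old;     # index the old lines once
    return grep { !$seen{$_}++ } @$new;   # hash lookup replaces the inner loop
}

# File names are the ones from the first post; skipped if they are absent.
if (-e 'old.txt' and -e 'new.txt') {
    open my $old_fh, '<', 'old.txt' or die "old.txt: $!";
    open my $new_fh, '<', 'new.txt' or die "new.txt: $!";
    chomp(my @old = <$old_fh>);
    chomp(my @new = <$new_fh>);

    open my $out_fh, '>', 'changes.txt' or die "changes.txt: $!";
    print {$out_fh} "$_\n" for new_records(\@old, \@new);
    close $out_fh or die "changes.txt: $!";
}
```

Each file is read once and each new line costs a single hash lookup, so the running time grows with the total number of lines rather than with their product.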

To give some context, we supply one of our contractors a list of all our properties/customers/etc every day. Because they're rubbish they find it difficult to deal with a big file, so they've asked for an exception report (i.e. they only get records which have changed since yesterday). We can't modify the SQL extract for reasons too tedious to go into, so I'm trying to play with the files so they get a smaller daily update.

Last edited by Dante Hicks; 18 Jan 2006 at 17:39.
Unread 18 Jan 2006, 17:46   #4
queball
Ball
 
Join Date: Oct 2001
Posts: 4,410
Re: [Perl] File / Array Comparisons

Hashes aren't much slower than arrays, they're just less easy to iterate over etc. You can delete from an array too - it just means that element ($array[$i]) will be undefined.
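
A small sketch of that difference, with made-up variable names: delete on a middle array element leaves an undefined gap and the array keeps its length, splice closes the gap, and delete on a hash key removes the entry altogether. Note that perldoc discourages delete on array elements, so treat the first case as illustration only.

```perl
use strict;
use warnings;

my @records = ('a', 'b', 'c');
delete $records[1];                   # slot 1 now reads as undef
my $length_after_delete = @records;   # still 3: the gap remains

# splice removes the element and closes the gap.
my @spliced = ('a', 'b', 'c');
splice @spliced, 1, 1;                # @spliced is now ('a', 'c')

# Deleting a hash key removes the entry entirely.
my %by_key = (1 => 'a', 2 => 'b', 3 => 'c');
delete $by_key{2};
my $keys_left = keys %by_key;         # 2
```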
Unread 4 Feb 2006, 04:33   #5
Arachnidman
You love me really
 
Join Date: Aug 2005
Posts: 342
Re: [Perl] File / Array Comparisons

man diff
Unread 4 Feb 2006, 12:00   #6
Dante Hicks
Clerk
 
Join Date: Jun 2001
Posts: 13,940
Re: [Perl] File / Array Comparisons

Quote:
Originally Posted by Arachnidman
man diff
diff doesn't output in the format that I want, unless I'm mistaken. And the two files could theoretically differ (e.g. in line order) while still containing the same rows/lines, and the rows/lines are what I'm interested in.
Unread 9 Feb 2006, 15:11   #7
queball
Ball
 
Join Date: Oct 2001
Posts: 4,410
Re: [Perl] File / Array Comparisons

You could still use sort and uniq to "canonicalise" the files so they can be compared. I don't trust diff though.
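
For what it's worth, the same canonicalising step can be sketched inside Perl rather than with the shell tools. The sub names canonical and same_records below are invented for illustration, and the sketch assumes the lines have already been read into arrays.

```perl
use strict;
use warnings;

# Rough equivalent of `sort | uniq`: drop duplicate lines, then sort,
# so two files with the same records in a different order compare equal.
sub canonical {
    my %uniq = map { $_ => 1 } @_;
    return sort keys %uniq;
}

# Two line lists hold the same records if their canonical forms match.
sub same_records {
    my ($a_lines, $b_lines) = @_;
    my @x = canonical(@$a_lines);
    my @y = canonical(@$b_lines);
    return 0 unless @x == @y;
    for my $i (0 .. $#x) {
        return 0 if $x[$i] ne $y[$i];
    }
    return 1;
}
```

Like uniq, this throws away how many times a line occurs, so it only answers whether the two files contain the same set of records.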
All times are GMT +1.