Thursday, 12 September 2013

Searching for a non-English word in a file with Python

I am trying to solve a "simple" problem in Python (2.7). Suppose that I have
two files:
key.txt, which holds a key to search for, and content.txt, which holds web
content (an HTML file).
Both files are saved as UTF-8. content.txt is a mixed file, meaning it
contains non-English characters alongside English ones.
I am trying to check whether the key in key.txt is found in the content.
Comparing the files as binary (bytes) didn't work, and decoding them didn't
work either.
I would also appreciate any help on how to search with a regex that is
mixed (a pattern built from English and non-English characters).
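
For reference, here is a minimal sketch of one common way to do this kind of
search. It assumes both files really are UTF-8 and that the mismatch comes
from a trailing newline or a byte order mark in key.txt, which may not be the
actual cause here. The function names read_text, key_in_content and
find_mixed are hypothetical, made up for illustration.

```python
import io
import re


def read_text(path):
    # io.open decodes the bytes to unicode text on Python 2.7 and 3 alike.
    # "utf-8-sig" also drops a leading byte order mark if an editor added one.
    with io.open(path, encoding="utf-8-sig") as handle:
        return handle.read()


def key_in_content(key_path, content_path):
    # Strip the key: a trailing newline in key.txt is an easy-to-miss
    # reason for a plain comparison to fail.
    key = read_text(key_path).strip()
    content = read_text(content_path)
    return key in content


def find_mixed(pattern, text):
    # Pattern and text should both be unicode strings. re.UNICODE makes
    # \w, \b and friends Unicode-aware on Python 2.7.
    return re.findall(pattern, text, re.UNICODE)
```

With this, key_in_content("key.txt", "content.txt") would return True or
False, and find_mixed could be called with a unicode pattern that mixes
English and non-English characters. Note that an empty key would always
count as found, so a real script might want to check for that first.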
