
On Sep 29, 8:32 pm, hall.j...@gmail.com wrote:
> I think he's saying it should look like this:
>
> # File: masseditor.py
>
> import re
> import os
> import time
>
> p1= re.compile('(href=|HREF=)+(.*)(#)+(.*)(\w\'\?-<:)+(.*)(">)+')
> p2= re.compile('(name=")+(.*)(\w\'\?-<:)+(.*)(">)+')
> p100= re.compile('(a name=)+(.*)(-)+(.*)(></a>)+')
> q1= r"\1\2\3\4_\6\7"
> q2= r"\1\2_\4\5"
>
> def massreplace():
>     editfile = open("C:\Program Files\Credit Risk Management\Masseditor\editfile.txt")
>     filestring = editfile.read()
>     filelist = filestring.splitlines()
>
>     for i in range(len(filelist)):
>         source = open(filelist[i])
>         starttext = source.read()
>
>         for i in range (13):
>             interimtext = p1.sub(q1, starttext)
>             interimtext= p2.sub(q2, interimtext)
>             interimtext= p100.sub(q2, interimtext)
>         source.close()
>         source = open(filelist[i],"w")
>         source.write(finaltext)
>         source.close()
>
> massreplace()
>
> I'll try that and see how it works...
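
A side note on the quoted script before getting to the regex: as posted it writes `finaltext`, which is never assigned, the inner loop reuses `i`, and every pass starts again from `starttext`, so 13 passes achieve no more than one. Here is a minimal sketch of the loop with those three points addressed. The names `apply_repeatedly`, `rewrite_files`, and `substitutions` are made up for illustration, and the pattern in the usage note is a stand-in, not one of the quoted ones.

```python
import re

def apply_repeatedly(text, substitutions, max_passes=13):
    """Apply each (pattern, replacement) pair in order, feeding every
    pass the output of the previous one, until nothing changes."""
    for _ in range(max_passes):
        previous = text
        for pattern, replacement in substitutions:
            text = pattern.sub(replacement, text)
        if text == previous:  # stable: further passes would be no-ops
            break
    return text

def rewrite_files(list_path, substitutions):
    """Rewrite in place every file named, one per line, in list_path."""
    with open(list_path) as listing:
        paths = [line.strip() for line in listing if line.strip()]
    for path in paths:
        with open(path) as source:
            original = source.read()
        updated = apply_repeatedly(original, substitutions)
        if updated != original:  # leave untouched files alone
            with open(path, "w") as target:
                target.write(updated)
```

For example, with the stand-in pair `(re.compile(r'(#\w+) (\w)'), r'\1_\2')`, the text `#A Web Sites` needs two passes: the first gives `#A_Web Sites`, the second `#A_Web_Sites`. That only works because each pass starts from the result of the one before.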

Ok, if you want a single RE... How about:


test = '''
<a href="Web_Sites.htm#A Web Sites">
<a name="A Web Sites"></a>
<a
href="Web_Sites.htm#A Web Sites">
<a
name="A Web Sites"></a>
<a HREF="Web_Sites.htm#A Web Sites">
<a name=Quoteless></a>
<a name = "oo ps"></a>
'''

import re

r = re.compile(r'''
(?:href=['"][^#]+[#]([^"']+)["'])
| (?:name=['"]?([^'">]+))
''', re.IGNORECASE | re.MULTILINE | re.DOTALL | re.VERBOSE)

def zap_space(m):
    return m.group(0).replace(' ', '_')

print(r.sub(zap_space, test))

It prints out

<a href="Web_Sites.htm#A_Web_Sites">
<a name="A_Web_Sites"></a>
<a
href="Web_Sites.htm#A_Web_Sites">
<a
name="A_Web_Sites"></a>
<a HREF="Web_Sites.htm#A_Web_Sites">
<a name=Quoteless></a>
<a name = "oo ps"></a>
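
The last test line comes through unchanged because the pattern requires `name=` with nothing between the attribute name and the equals sign. One possible variation, offered as a sketch rather than a drop-in fix, allows optional whitespace around `=`. The names `r2` and `zap_value_space` are made up for illustration. It also replaces spaces only inside the captured value, so the padding around `=` is left alone.

```python
import re

# Hypothetical variant of the pattern above: \s* on both sides of "="
# lets it match attributes written as  name = "..."
r2 = re.compile(r'''
    (?:href\s*=\s*['"][^#]+[#]([^"']+)["'])
  | (?:name\s*=\s*['"]?([^'">]+))
''', re.IGNORECASE | re.VERBOSE)

def zap_value_space(m):
    # Exactly one of the two groups takes part in any given match.
    index = 1 if m.group(1) is not None else 2
    value = m.group(index)
    start = m.start(index) - m.start(0)  # offset of the value inside the match
    whole = m.group(0)
    return whole[:start] + value.replace(' ', '_') + whole[start + len(value):]

print(r2.sub(zap_value_space, '<a name = "oo ps"></a>'))
print(r2.sub(zap_value_space, '<a HREF="Web_Sites.htm#A Web Sites">'))
```

The first call should give `<a name = "oo_ps"></a>` and the second `<a HREF="Web_Sites.htm#A_Web_Sites">`. Like the original, this assumes every href contains a `#`; for anything messier than that, a real HTML parser is probably the safer route.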

-- bjorn
