regex and greedy behavior

Discussion in 'all things UNIX' started by vasa1, Dec 21, 2011.

Thread Status:
Not open for further replies.
  1. vasa1

    vasa1 Registered Member

    Joined:
    May 1, 2010
    Posts:
    4,152
    I'm trying to learn regex. In the examples below, I want a pattern that finds just the parts in bold, in text that is open in Geany or in LibO's Writer. It seems that regex matching is greedy by default, and the pattern <a href.*> picks up each example in its entirety (in both Geany and Writer).

    <A HREF="hxxp://www.w3schools.com/html/html_styles.asp">hxxp://www.w3schools.com/html/html_styles.asp</A></P>

    <A HREF="hxxp://www.w3schools.com/html/tryit.asp?filename=tryhtml_styles" TARGET="_blank"><FONT COLOR="#900b09"><FONT FACE="verdana, helvetica, arial, sans-serif"><FONT SIZE=2><SPAN STYLE="font-style: normal"><SPAN STYLE="font-weight: normal"><SPAN STYLE="background: transparent">

    <A HREF="hxxp://www.w3schools.com/css/default.asp"><FONT COLOR="#900b09"><FONT FACE="verdana, helvetica, arial, sans-serif"><FONT SIZE=2><SPAN STYLE="font-style: normal"><SPAN STYLE="font-weight: normal"><SPAN STYLE="background: transparent">

    <A HREF="hxxp://www.w3schools.com/html/tryit.asp?filename=tryhtml_bodybgstyle" TARGET="_blank"><FONT COLOR="#ffffff"><SPAN STYLE="text-decoration: none"><FONT FACE="verdana, helvetica, arial, sans-serif"><FONT SIZE=1 STYLE="font-size: 8pt"><SPAN STYLE="font-style: normal"><B><SPAN STYLE="background: #98bf21">

    Edit: <[^>]+> works, since [^>]+ cannot run past the first closing >.
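
    To make the difference concrete, here is a minimal sketch using Python's re module. That choice is an assumption made purely for illustration: Geany and Writer each use their own regex engine, and support for the lazy form .*? may differ between them. The test string is the first example line from above, and the anchor-only pattern <a href[^>]+> is a hypothetical variation on the pattern from the edit.

```python
import re

# First example line from the post, used as the test string.
line = ('<A HREF="hxxp://www.w3schools.com/html/html_styles.asp">'
        'hxxp://www.w3schools.com/html/html_styles.asp</A></P>')

# Greedy: .* consumes the rest of the line, then backs off only as far as
# the LAST ">", so the match spans the whole line.
greedy = re.search(r'<a href.*>', line, re.IGNORECASE).group(0)

# Lazy: .*? expands one character at a time and stops at the FIRST ">".
lazy = re.search(r'<a href.*?>', line, re.IGNORECASE).group(0)

# Negated class: [^>]+ can never cross a ">", so every tag is matched
# on its own (this finds all tags, not only the anchor).
tags = re.findall(r'<[^>]+>', line)

# Hypothetical variation: restrict the negated-class pattern to anchor tags.
anchors = re.findall(r'<a href[^>]+>', line, re.IGNORECASE)

print(greedy == line)  # the greedy match is the entire line
print(lazy)
print(tags)
print(anchors)
```

    Note that <[^>]+> matches every tag on the line, including </A> and </P>, so the anchor-only variation may be closer to what is wanted when only the opening link tag matters.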
     
    Last edited: Dec 22, 2011