BBCode Parser

Pikachu

I've decided to code my own BBCode parser in Perl:

Code:
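#!/usr/bin/perl
use strict;
use warnings;
use CGI;

my $query = CGI->new;
# Use an empty string until the form has been submitted.
$_ = $query->param('text') // '';

print "Content-Type: text/html\n\n";

# Escape raw HTML first so posted text cannot inject markup or break out of attributes.
s/&/&amp;/g;
s/</&lt;/g;
s/>/&gt;/g;
s/"/&quot;/g;

# Simple paired tags. Each rule is applied independently, so misnested BBCode yields misnested HTML.
s@\[b\](.*?)\[/b\]@<b>$1</b>@ig;
s@\[i\](.*?)\[/i\]@<i>$1</i>@ig;
s@\[u\](.*?)\[/u\]@<u>$1</u>@ig;
# The closing tag here was [/img]; it has to be [/big].
s@\[big\](.*?)\[/big\]@<span style='font-size: large'>$1</span>@ig;
# Links and images. The lookahead rejects URLs with "script" before the first colon (e.g. javascript:).
s@\[url=((?![^:]+script).*?)\](.*?)\[/url\]@<a href="$1">$2</a>@ig;
s@\[url\]((?![^:]+script).*?)\[/url\]@<a href="$1">$1</a>@ig;
s@\[img\]((?![^:]+script).*?)\[/img\]@<img src="$1" alt="[IMAGE]" />@ig;
# Alignment blocks.
s@\[center\](.*?)\[/center\]@<p style="text-align:center">$1</p>@ig;
s@\[left\](.*?)\[/left\]@<p style="text-align:left">$1</p>@ig;

print "<form action='' method='post'><textarea name='text'></textarea><input type='submit' value='submit' /></form>";
print $_;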
It works fine but when translated to HTML, it isn't parsed correctly. For example, [b][i]bold[/b][/i] becomes <b><i>bold</b></i>. How would I write the script so that it parses correctly in HTML?
 
Why not do straight HTML parsing with a DOM? Regular expressions aren’t really suited for this sort of thing—formatting tags would probably be better interpreted with a stack.
 
It works fine but when translated to HTML, it isn't parsed correctly. For example, [b][i]bold[/b][/i] becomes <b><i>bold</b></i>. How would I write the script so that it parses correctly in HTML?

That's because you nested the tags wrong in the first place. If you open "b" before "i", you close out "i" before closing "b."

[b][i]foo[/i][/b] becomes <b><i>foo</i></b>, which is valid HTML.
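
If you want the script to detect this kind of misnesting before it produces any HTML, the nesting rule above can be checked with a stack. This is only a rough sketch: is_well_nested is a made-up helper name, and it looks at b, i and u only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if every tag is closed in the reverse order
# of opening, 0 otherwise. Only [b], [i] and [u] are considered here.
sub is_well_nested {
    my ($text) = @_;
    my @stack;    # names of the tags that are currently open
    while ($text =~ m{\[(/?)([biu])\]}gi) {
        my ($closing, $name) = ($1, lc $2);
        if (!$closing) {
            push @stack, $name;
        }
        else {
            # A closing tag must match the most recently opened tag.
            return 0 unless @stack && $stack[-1] eq $name;
            pop @stack;
        }
    }
    return @stack ? 0 : 1;    # tags left open also count as an error
}

for my $input ('[b][i]foo[/i][/b]', '[b][i]bold[/b][/i]') {
    my $verdict = is_well_nested($input) ? 'ok' : 'misnested';
    print "$input => $verdict\n";
}
```

The first input is reported as ok and the second as misnested, because [/b] arrives while [i] is still on top of the stack.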
 
Yeah, I know that. I want to know how to fix that using the script. It's for a guestbook.
 
Well, the way MyBB handles that is to use span tags instead of "b" and "i" tags, like so:

Code:
<span style="font-weight: bold;"><span style="font-style: italic;">foo</span></span>
That way, if the tags are improperly nested, it doesn't matter because they're both the same type of HTML tag. vBulletin actually rearranges the tags. I'm not sure how they do that because I don't have a vBulletin license.
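
Here is a rough Perl sketch of that span idea. It is not MyBB's actual code; the %style table and the bbcode_to_spans name are assumptions made up for the example. Opening and closing tags are replaced independently, and every closing tag turns into the same </span>.

```perl
use strict;
use warnings;

# Assumed mapping from BBCode tag name to the CSS used for its <span>.
my %style = (
    b => 'font-weight: bold;',
    i => 'font-style: italic;',
    u => 'text-decoration: underline;',
);

# Hypothetical helper: replace opening and closing tags independently.
# Every closing tag becomes the same </span>, so the nesting order cannot clash.
sub bbcode_to_spans {
    my ($text) = @_;
    $text =~ s{\[([biu])\]}{'<span style="' . $style{lc $1} . '">'}gie;
    $text =~ s{\[/[biu]\]}{</span>}gi;
    return $text;
}

print bbcode_to_spans('[b][i]foo[/b][/i]'), "\n";
```

For the misnested input this prints the same markup as the MyBB example above. It only balances when the post has as many closing tags as opening ones, and it only covers tags that can all be drawn as a span.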
 
Hmm, that works to an extent. What if they wanted a bold URL, like [url=#][b]hi[/url][/b]?

Thanks for your help, though! That made it a bit easier.
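
For the URL case, one option is the stack approach suggested earlier in the thread: keep track of which tags are open, and when a closing tag arrives, first close everything that was opened inside it. The sketch below is only one way to do it, not how vBulletin or MyBB work. The names render_bbcode, escape_html and is_safe_url are made up for the example, it supports just b, i, u and the [url=...] form, and stray closing tags are dropped rather than reopened.

```perl
use strict;
use warnings;

# Closing HTML for each supported BBCode tag (an illustrative subset).
my %close = (b => '</b>', i => '</i>', u => '</u>', url => '</a>');

# Minimal escaping so text and attribute values cannot inject markup.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Assumed policy: allow http(s) links and URLs without any scheme, such as "#".
sub is_safe_url {
    my ($url) = @_;
    return $url =~ m{^(?:https?://|[^:]*$)}i;
}

# Hypothetical renderer that always emits properly nested HTML.
sub render_bbcode {
    my ($text) = @_;
    my @stack;        # names of currently open tags, innermost last
    my $html = '';
    # Split into tags and plain text; the capture group keeps the tags.
    for my $token (split m{(\[/?(?:b|i|u|url)(?:=[^\]]*)?\])}i, $text) {
        if ($token =~ m{^\[(b|i|u)\]$}i) {
            my $name = lc $1;
            push @stack, $name;
            $html .= "<$name>";
        }
        elsif ($token =~ m{^\[url=([^\]]*)\]$}i) {
            my $href = $1;
            if (is_safe_url($href)) {
                push @stack, 'url';
                $html .= '<a href="' . escape_html($href) . '">';
            }
            else {
                $html .= escape_html($token);    # show rejected links as text
            }
        }
        elsif ($token =~ m{^\[/(b|i|u|url)\]$}i) {
            my $name = lc $1;
            next unless grep { $_ eq $name } @stack;    # stray closer: drop it
            # Close everything opened inside this tag, then the tag itself.
            while (defined(my $open = pop @stack)) {
                $html .= $close{$open};
                last if $open eq $name;
            }
        }
        else {
            $html .= escape_html($token);
        }
    }
    # Close anything that was never closed.
    $html .= $close{$_} for reverse @stack;
    return $html;
}

print render_bbcode('[url=#][b]hi[/url][/b]'), "\n";
print render_bbcode('[b][i]bold[/b][/i]'), "\n";
```

The first call prints <a href="#"><b>hi</b></a> and the second prints <b><i>bold</i></b>. The trade-off is that text after an early closing tag loses the inner formatting, e.g. in [b]one [i]two[/b] three[/i] the word "three" is no longer italic. Reopening the inner tags after the close would fix that, at the cost of a bit more bookkeeping.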
 