rfc2html: PHP script to view RFCs with an index and links

If you read RFCs often and miss having an index and links while reading them, you should check out rfc2html. It is a script that takes a plain-text RFC and converts it to HTML.

You can get the original code from SourceForge.

However, I found some small issues with the script and have mailed the author about them. In the meantime, you can use the diff below to fix the issues, or download the diff file rfc2html.diff and apply it:

--- rfc2html.php	2014-06-27 18:42:14.027210656 +0530
+++ new/rfc2html.php	2014-07-06 12:06:23.212308365 +0530
@@ -19,7 +19,7 @@
- * @version $Id: rfc2html.php,v 1.9 2006/02/08 21:44:42 chmate Exp $
+ * @version $Id: rfc2html.php 15 2006-02-22 08:52:04Z chmate $
  * @author Chang Hsiou-Ming <chmate@gmail.com>
@@ -35,8 +35,8 @@
 define("PAGE_COLUMNS", 72);
 define("BUF_SIZE", 8192);
 define("CENTRAL_ERROR", 4);
-define("REF_PATTERN", '/\[RFC(\d+)\]/');
-define("REF_REPLACE", '<a class="ref" href="rfc2html.php?in=\1">\0</a>');
+define("REF_PATTERN", '/\[(\w*\d+)\]/');
+define("REF_REPLACE", '<a class="ref" href="#REF\1">\0</a>');
 define("REFED_REPLACE", '<a name="REF\1">\0</a>');
 define("SEC_NUMBER", '/^(\d+(\.(\d|\w)+)*)(\s|\.)/');
 define("SEC_PATTERN", '/((section|sec)\s*(\d+(\.\d+)*))/i');
@@ -235,7 +235,6 @@
 		echo "</div><!-- page -->\n";
-	//echo '<pre>'; var_dump($rfc_toc); echo '</pre>';	
 	$toc = build_toc($toc);
 	echo "</div><!-- pages -->\n";
@@ -639,12 +638,11 @@
 #sidebar {
-	position: fixed;
-	top: 5px;
-	left: 1px; 
+	position: absolute;
+	top: 50px;
+	left: 10px;
 	width: 280px;
 	margin: 0;
-    font-size:10px;
 #navbar {
@@ -727,10 +725,8 @@
 div.toolbar {
-    position: fixed;
 	background: #e0e0e0;
-    width: 100%;
-	margin: 10;
+	margin: 0;
 	padding: 10px 2em 10px 1em;
 	border: 2px dashed #bbbbbb;
@@ -812,7 +808,6 @@
 	<?php @include 'rfc2html_head.php'; ?>	
 <div class="toolbar">
 	<form method="get" action="rfc2html.php">
@@ -821,7 +816,7 @@
 			<input type="submit" value="Go!" />
-</div> -->
 	if($text) {
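
The most visible change in the diff is the new REF_PATTERN / REF_REPLACE pair: references are no longer limited to the [RFCnnnn] form, and they now link to an anchor inside the same page instead of loading another RFC. Here is a rough Python sketch of that behaviour, not the script's own code. The two patterns are copied from the diff; the function name link_references and the sample line are made up for illustration, and Python's \g<0> stands in for PHP's \0.

```python
import re

# Patterns transcribed from the diff above (PHP delimiters removed).
OLD_REF_PATTERN = re.compile(r'\[RFC(\d+)\]')
NEW_REF_PATTERN = re.compile(r'\[(\w*\d+)\]')
# PHP's \0 (whole match) is written \g<0> in Python.
NEW_REF_REPLACE = r'<a class="ref" href="#REF\1">\g<0></a>'


def link_references(line: str) -> str:
    """Hypothetical helper: wrap every bracketed reference in an in-page link."""
    return NEW_REF_PATTERN.sub(NEW_REF_REPLACE, line)


# Illustrative input only; not taken from any particular RFC.
sample = "See [RFC2119] and [ISO8601] as well as [1]."

old_hits = OLD_REF_PATTERN.findall(sample)  # references the old pattern catches
new_hits = NEW_REF_PATTERN.findall(sample)  # references the new pattern catches
linked = link_references(sample)

print(old_hits)
print(new_hits)
print(linked)
```

Note that with the new pattern the captured group includes the "RFC" prefix, so [RFC2119] ends up pointing at #REFRFC2119.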

Get the contents of a whole site, such as a wiki or Wikia

For wikis and Wikia sites, if you want to mirror a URL, websucker.py is an excellent option. The script ships with the Python sources, so to get the tool, first download the source package:

yumdownloader --source python

Install the source RPM that was downloaded to the current directory, then go to ~/rpmbuild/SOURCES. You should find a Python-*.tar.xz file there; just extract it with

tar xvf Python*.tar.xz

and there you go, you should find the tool in Tools/webchecker/websucker.py.


Get all the URLs in an HTML file (local or on a server)

To use this, you will need the lynx tool, so install that first.

sudo yum install lynx

Now, to get a list of all the URLs in a local HTML file or at some URL, just run this:

lynx -dump -listonly <file-or-URL>
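
If lynx is not available, roughly the same list can be produced with Python's standard library. This is only a sketch of the idea, not a replacement for lynx: it collects just the href values of <a> tags, and the names LinkCollector, extract_links and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url=""):
    """Return all link targets found in html_text, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links


# Illustrative input only; example.com and example.org are placeholders.
sample_html = (
    '<p><a href="/rfc/rfc2119.txt">RFC 2119</a> '
    '<a href="http://example.org/">home</a></p>'
)
links = extract_links(sample_html, "http://example.com/index.html")

# Print a numbered list, one URL per line.
for number, url in enumerate(links, start=1):
    print(f"{number}. {url}")
```

For a local file, read its contents and pass them to extract_links; for a page on a server, fetch it first (for example with urllib.request.urlopen) and pass the page URL as base_url.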

