#include <webnode.h>
Inherits MemoryPooled< WebNodeStruct >.
Public Methods

WebNode (uint32 idno)
void InsertRawLinks (RawLinkSet *s)
    Inserts a set of tolinks into the WebNode.
void NormalizeRawLinks (SimpleHashTable< WebNodePtr > *h)
    Sorts all valid links into the first part of the tolinks array.
size_t RealSize ()
    Returns the full size of the WebNode including link arrays.
int NumberOfValidToLinks ()
int NumberOfDanglingToLinks ()
int NumberOfValidFromLinks ()
int NumberOfLeafLinks ()
void IncrementNumberOfFromLinks ()
void AppendFromLink (WebNodePtr anothernode) throw (overflow_error)
    This appends a webnode to the fromlinks list.
void UpdateLeafLinks (SimpleLeafNodePtrHashTable *leaftable)
    Sifts LeafNode pointers upward in the tolinks array.
void SetDate (uint16 adate)
    Sets the earliest known date of the WebNode.
WebNodePtr ValidToLink (int k)
WebNodePtr ValidFromLink (int k)
LeafNodePtr ValidLeafLink (int k)
LeafNodePtr ValidLeafLinkDirectly (int k)
uint32 ID ()
uint16 Date ()
void ClearTag ()
void SetTag (int k)
bool Tagged (int k)
void ClearOccupationCount ()
uint32 OccupationCount ()
void IncrementOccupationCount ()
void IncrementOccupationCount (int c)
ScratchStruct Scratch ()
void SetScratch (ScratchStruct ascratch)

Static Private Attributes

MemPool< LinkStruct > global_link_pool
Every web document read by the ripper is represented by a WebNode. Constructing a WebNode is complicated and is handled by GraphBuilder, which also links the nodes into a WebLinkGraph. All the data members are defined in a WebNodeStruct; WebNode is really just a wrapper around WebNodeStruct that handles custom memory management. The class inherits its memory management from MemoryPooled<T>.
Definition at line 92 of file webnode.h.
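The wrapper arrangement described above can be illustrated with a minimal sketch. It is not the actual WebNode or MemoryPooled<T> code: `Pooled`, `NodeData` and `Node` are hypothetical stand-ins, and the pool here is a simple `std::deque` that never releases storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for WebNodeStruct: plain data members only.
struct NodeData {
    std::uint32_t id = 0;
    std::uint16_t date = 0;
};

// Hypothetical stand-in for MemoryPooled<T>: every object takes one slot
// from a pool shared by all objects of the same payload type and reaches
// its members through `data`.
template <typename T>
class Pooled {
public:
    Pooled() : data(Allocate()) {}
    static std::size_t PoolSize() { return Pool().size(); }

protected:
    T* data;  // points into the shared pool

private:
    // std::deque keeps existing elements in place when growing at the end,
    // so pointers handed out earlier stay valid.
    static std::deque<T>& Pool() {
        static std::deque<T> pool;
        return pool;
    }
    static T* Allocate() {
        Pool().emplace_back();
        return &Pool().back();
    }
};

// The wrapper adds accessors only; all state lives in the pooled struct.
class Node : public Pooled<NodeData> {
public:
    explicit Node(std::uint32_t idno) { data->id = idno; }
    std::uint32_t ID() const { return data->id; }
    std::uint16_t Date() const { return data->date; }
};
```

Constructing `Node a(7);` in this sketch reserves one `NodeData` slot in the shared pool, and `a.ID()` reads the id back through `data`.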
WebNode::WebNode (uint32 idno)
Definition at line 39 of file webnode.cc. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::date, WebNodeStruct::fromlinks, WebNodeStruct::id, WebNodeStruct::num_fromlinks, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, WebNodeStruct::tolinks, and uint32.
void WebNode::AppendFromLink (WebNodePtr anothernode) throw (overflow_error)
This appends a webnode to the fromlinks list.
Definition at line 99 of file webnode.cc. References MemPool< LinkStruct >::Allocate(), MemPoolObject< S >::data, and global_link_pool.
void WebNode::ClearOccupationCount ()
Definition at line 160 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::occupation_count, and WebNodeStruct::scratch.
void WebNode::ClearTag ()
Definition at line 148 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::tag.
uint16 WebNode::Date ()
Definition at line 145 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::date, and uint16. Referenced by GraphBuilder::NodeGetDate(), GraphBuilder::NodeLaunch(), and DateBiasedPageRankSampler::QEvolveFrom().
uint32 WebNode::ID ()
Definition at line 143 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::id, and uint32. Referenced by Talker::LoadLeaves(), and GraphBuilder::NodeGetID().
void WebNode::IncrementNumberOfFromLinks ()
Definition at line 109 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_fromlinks.
void WebNode::IncrementOccupationCount (int c)
Definition at line 169 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::occupation_count.
void WebNode::IncrementOccupationCount ()
Definition at line 167 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::occupation_count. Referenced by WebSampler::SimulateAllocForward(), and WebSampler::TaggedSimulateForward().
void WebNode::InsertRawLinks (RawLinkSet *s)
Inserts a set of tolinks into the WebNode.
Inserts (or merges) the raw (character ptr) links into the webnode's tolinks array.
Definition at line 70 of file webnode.cc. References MemPool< LinkStruct >::Allocate(), MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, MemPool< LinkStruct >::Deallocate(), global_link_pool, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, LinkStruct::pointer_diff, RawLinkSet, and WebNodeStruct::tolinks. Referenced by GraphBuilder::NodeInsertLinks().
void WebNode::NormalizeRawLinks (SimpleHashTable< WebNodePtr > *h)
Sorts all valid links into the first part of the tolinks array.
While sorting, it also converts the pointer_differences into webnode_ptrs. Dangling links are left in the upper half of the array. Note: the second argument should really be a data member of the first (more elegant), but then we'd have to create a derived SimpleHashTable...
Definition at line 137 of file webnode.cc. References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, SimpleHashTable< R >::Find(), WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, NumberOfValidToLinks(), OccupationCount(), LinkStruct::pointer_diff, WebNodeStruct::tolinks, ValidToLink(), and LinkStruct::webnode_ptr.
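The normalization step described above (resolve each raw link, move the resolved ones to the front, leave dangling ones at the back) can be sketched as follows. This is an illustration under assumptions, not the code in webnode.cc: `Link`, `Node` and `NormalizeLinks` are hypothetical names, a `std::unordered_map` stands in for SimpleHashTable, and a separate `raw_key` field stands in for the pointer_diff that the real LinkStruct converts in place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical node type; only identity matters for this sketch.
struct Node {
    std::uint32_t id;
};

// Hypothetical link slot: holds a raw key until it is resolved.
struct Link {
    std::uint32_t raw_key;  // assumption: stands in for pointer_diff
    Node* target;           // assumption: stands in for webnode_ptr
};

// Resolves every link against `table`, then moves the resolved links to
// the front of the array. Returns the number of valid links, so that
// [0, n) is valid and [n, size) is dangling.
std::size_t NormalizeLinks(
    std::vector<Link>& links,
    const std::unordered_map<std::uint32_t, Node*>& table) {
    for (Link& link : links) {
        auto it = table.find(link.raw_key);
        link.target = (it != table.end()) ? it->second : nullptr;
    }
    // stable_partition keeps the relative order within each group.
    auto first_dangling = std::stable_partition(
        links.begin(), links.end(),
        [](const Link& link) { return link.target != nullptr; });
    return static_cast<std::size_t>(first_dangling - links.begin());
}
```

The returned count plays the role that num_valid_tolinks appears to play in the class: callers index only the first `n` slots when they want valid links.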
int WebNode::NumberOfDanglingToLinks ()
Definition at line 102 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, and WebNodeStruct::num_valid_tolinks.
int WebNode::NumberOfLeafLinks ()
Definition at line 106 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_leaflinks. Referenced by DateBiasedPageRankSampler::QEvolveFrom(), PageRankSampler::QEvolveFrom(), and UpdateLeafLinks().
int WebNode::NumberOfValidFromLinks ()
Definition at line 104 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_fromlinks. Referenced by WebLinkGraph::BuildFromSets(), and TruncatedKleinbergSampler::QEvolveFrom().
int WebNode::NumberOfValidToLinks ()
Definition at line 100 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_valid_tolinks. Referenced by NormalizeRawLinks(), TruncatedKleinbergSampler::QEvolveFrom(), DateBiasedPageRankSampler::QEvolveFrom(), PageRankSampler::QEvolveFrom(), WebSampler::SimulateAllocForward(), WebSampler::TaggedSimulateForward(), and UpdateLeafLinks().
uint32 WebNode::OccupationCount ()
Definition at line 165 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::occupation_count, and uint32. Referenced by NormalizeRawLinks(), and UpdateLeafLinks().
size_t WebNode::RealSize ()
Returns the full size of the WebNode including link arrays.
Definition at line 175 of file webnode.cc. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_fromlinks, and WebNodeStruct::num_tolinks.
ScratchStruct WebNode::Scratch ()
Definition at line 172 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::scratch, and ScratchStruct. Referenced by DateBiasedPageRankSampler::QEvolveFrom().
void WebNode::SetDate (uint16 adate)
Sets the earliest known date of the WebNode.
This function is designed to be called several times. The earliest nonzero date is retained.
Definition at line 55 of file webnode.cc. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::date, and uint16. Referenced by GraphBuilder::NodeSetDate().
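The "earliest nonzero date is retained" rule can be written out as a small pure function. This is a sketch of the rule as documented, not the body of SetDate() in webnode.cc; `EarliestNonzeroDate` is a hypothetical helper, and treating zero as "date unknown" is an assumption drawn from the wording above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: given the date stored so far and a newly seen date,
// returns the date that should be kept. Zero is assumed to mean "unknown".
inline std::uint16_t EarliestNonzeroDate(std::uint16_t current,
                                         std::uint16_t candidate) {
    if (candidate == 0) return current;    // new value carries no information
    if (current == 0) return candidate;    // first known date wins by default
    return candidate < current ? candidate : current;  // keep the earlier one
}
```

Because the function is idempotent and order-independent for nonzero inputs, repeated calls with the same set of dates always settle on the same result, which fits a setter that is meant to be called several times.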
void WebNode::SetScratch (ScratchStruct ascratch)
Definition at line 174 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::scratch, and ScratchStruct. Referenced by DateBiasedPageRankSampler::QEvolveFrom().
void WebNode::SetTag (int k)
Definition at line 150 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::tag, and TAG_NUMBER_OF_BITS. Referenced by WebLinkGraph::BuildFromSets(), and Talker::BuildTags().
bool WebNode::Tagged (int k)
Definition at line 155 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::tag. Referenced by WebLinkGraph::BuildFromSets(), and WebSampler::TaggedSimulateForward().
void WebNode::UpdateLeafLinks (SimpleLeafNodePtrHashTable *leaftable)
Sifts LeafNode pointers upward in the tolinks array.
The code for this function is very similar to that of NormalizeRawLinks().
Definition at line 185 of file webnode.cc. References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, SimpleHashTable< LeafNodePtr >::Find(), LinkStruct::leafnode_ptr, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, NumberOfLeafLinks(), NumberOfValidToLinks(), OccupationCount(), LinkStruct::pointer_diff, WebNodeStruct::tolinks, and ValidLeafLink().
WebNodePtr WebNode::ValidFromLink (int k)
Definition at line 124 of file webnode.h. References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, WebNodeStruct::fromlinks, and LinkStruct::webnode_ptr. Referenced by WebLinkGraph::BuildFromSets(), and TruncatedKleinbergSampler::QEvolveFrom().
LeafNodePtr WebNode::ValidLeafLink (int k)
Definition at line 129 of file webnode.h. References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_leaflinks, WebNodeStruct::num_valid_tolinks, and WebNodeStruct::tolinks. Referenced by DateBiasedPageRankSampler::QEvolveFrom(), and UpdateLeafLinks().
LeafNodePtr WebNode::ValidLeafLinkDirectly (int k)
Definition at line 135 of file webnode.h. References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, LinkStruct::leafnode_ptr, WebNodeStruct::num_leaflinks, WebNodeStruct::num_valid_tolinks, and WebNodeStruct::tolinks. Referenced by DateBiasedPageRankSampler::QEvolveFrom(), and PageRankSampler::QEvolveFrom().
WebNodePtr WebNode::ValidToLink (int k)
Definition at line 118 of file webnode.h. References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_valid_tolinks, WebNodeStruct::tolinks, and LinkStruct::webnode_ptr. Referenced by NormalizeRawLinks(), TruncatedKleinbergSampler::QEvolveFrom(), DateBiasedPageRankSampler::QEvolveFrom(), and PageRankSampler::QEvolveFrom().
MemPool< LinkStruct > WebNode::global_link_pool [static, private]
Definition at line 31 of file webnode.cc. Referenced by AppendFromLink(), and InsertRawLinks().