Fast Parallel Algorithms for Universal Lossless Source Coding.


Transcripts
Slide 1

Fast Parallel Algorithms for Universal Lossless Source Coding Dror Baron CSL & ECE Department, UIUC dbaron@uiuc.edu Ph.D. Defense – February 18, 2003

Slide 2

Overview Motivation, applications, and objectives Background: Source models Lossless source coding + universality Semi-predictive methods An O(N) semi-predictive universal encoder Two-part codes Rigorous analysis of their compression quality Application to parallel compression of Bernoulli sequences Parallel semi-predictive (PSP) coding Achieving a work-efficient algorithm Theoretical results Summary

Slide 3

Motivation Lossless compression: text files, images, software executables, medical, financial, etc. What do we want in a compression algorithm? Universality: adaptive to a large class of sources Good compression quality Speed: low computational complexity Simple implementation Low memory use Sequential vs. offline

Slide 4

Why Parallel Compression? Some applications require high data rates: Compressed pages in virtual memory Remote archiving + fast communication links Real-time compression in storage systems Power reduction for interconnects on a circuit board Serial compression is limited by the clock rate

Slide 5

Room for Improvement and Goals Previous Art: Serial universal source coding methods have reached the limits on compression quality [Willems1998,Rissanen1999] Parallel source coding algorithms have high complexity and/or poor compression quality Naïve parallelization compresses poorly Parallel dictionary compression [Franaszek et al. 1996] Parallel context tree weighting [Stassen&Tjalkens2001,Willems2000] Research Goals: "good" parallel compression algorithm Work-efficient: O(N/B) time with B computational units Compression quality: as good as the best serial methods (almost!)

Slide 6

Main Contributions BWT-MDL (O(N) universal encoder): An O(N) algorithm that achieves Rissanen's redundancy bounds on best achievable compression Combines efficient prefix tree construction with the semi-predictive approach to universal coding Fast Suffix Sorting (not in this talk): Core algorithm is very simple (can be implemented in VLSI) Worst-case complexity O(N log^0.5(N)) Competitive with other suffix sorting methods in practice Two-Part Codes: Rigorous analysis of their compression quality Application to distributed/parallel compression Optimal two-part codes Parallel Compression Algorithm (not in this talk): Work-efficient O(N/B) algorithm Compression loss is roughly B log(N/B) bits

Slide 7

Source Models Binary alphabet X={0,1}, sequence x ∈ X^N Bernoulli Model: i.i.d. model p(x_i=1)=θ Order-K Markov Model: Previous K symbols called the context Context-dependent conditional probability for the next symbol More flexible than Bernoulli Exponentially many states
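The two source models above assign a probability to a sequence in different ways: the Bernoulli model uses only the symbol counts, while the order-K Markov model conditions each symbol on its length-K context. A minimal sketch (the function names and the dict-based context table are illustrative, not from the slides):

```python
import math

def bernoulli_logprob(x, theta):
    # i.i.d. Bernoulli model: log2 p(x) depends only on the counts n0, n1
    n1 = sum(x)
    n0 = len(x) - n1
    return n1 * math.log2(theta) + n0 * math.log2(1 - theta)

def markov_logprob(x, K, cond):
    # Order-K Markov model: cond maps a length-K context tuple to P(next = 1).
    # The first K symbols are treated as given (no probability charged).
    logp = 0.0
    for i in range(K, len(x)):
        ctx = tuple(x[i - K:i])
        p1 = cond[ctx]
        logp += math.log2(p1 if x[i] == 1 else 1 - p1)
    return logp
```

For K=1 the context table has 2 entries; for order K it has 2^K, which is the "exponentially many states" drawback the slide notes.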

Slide 8

Context Tree Sources [Figure: context tree with root, internal node 1, and leaves 0, 01, 11; P(x_{n+1}=1|0)=0.8, P(x_{n+1}=1|01)=0.4, P(x_{n+1}=1|11)=0.9] More flexible than Bernoulli More compact than Markov Particularly useful for text Works for M-ary alphabets State = context + conditional probabilities Example: N=11, x=01011111111

Slide 9

Review of Lossless Source Coding Stationary ergodic sources Entropy rate H = lim_{N→∞} H(x)/N Asymptotically, H is the lowest achievable per-symbol rate Arithmetic coding: Probability assignment p(x) Coding length l(x) = -log(p(x)) + O(1) Can achieve entropy + O(1) bits
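The key identity on this slide is that an arithmetic coder driven by probability assignment p(x) spends essentially -log2 p(x) bits, which equals N·H when the model matches the empirical statistics. A small numeric check of that identity for an i.i.d. Bernoulli source (function names are mine):

```python
import math

def entropy_rate_bernoulli(theta):
    # Binary entropy function H(theta) in bits per symbol
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)

def coding_length(x, theta):
    # Ideal arithmetic-coding length l(x) = -log2 p(x), ignoring the O(1) term
    n1 = sum(x)
    n0 = len(x) - n1
    return -(n1 * math.log2(theta) + n0 * math.log2(1 - theta))
```

When theta equals the empirical frequency n1/N, coding_length(x, theta) is exactly N times the binary entropy of that frequency, i.e. the coder achieves the entropy up to O(1) bits.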

Slide 10

Universal Source Coding Source statistics are unknown Need probability assignment p(x) Need to estimate the source model Need to describe the estimated source (explicitly or implicitly) Redundancy: excess coding length above entropy ρ(x) = l(x) - NH

Slide 11

Redundancy Bounds Rissanen's bound (K unknown parameters): E[ρ(x)] ≥ (K/2)[log(N)+O(1)] Worst-case redundancy for Bernoulli sequences (K=1): ρ(x*) = max_{x∈X^N} {ρ(x)} ≈ 0.5 log(πN/2) Asymptotically, ρ(x)/N → 0 In practice, e.g., text, the number of parameters scales linearly with N Low redundancy is still important
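To get a feel for the Bernoulli worst-case figure, the 0.5·log(πN/2) bound can be evaluated directly; note it grows by exactly half a bit each time N doubles, which is the (K/2)·log(N) behavior of Rissanen's bound with K=1. A small helper (assuming base-2 logs, as is usual for bit counts):

```python
import math

def bernoulli_worst_case_bound(N):
    # Worst-case redundancy bound for Bernoulli sequences (K=1 parameter):
    # rho(x*) ~ 0.5 * log2(pi * N / 2) bits
    return 0.5 * math.log2(math.pi * N / 2)
```

For N = 1000 this is about 5.3 bits of overhead on the whole sequence, which is why ρ(x)/N vanishes asymptotically yet still matters for short blocks.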

Slide 12

Semi-Predictive Approach [Figure: x → MDL Estimator (Phase I) → S*; x and S* → Encoder (Phase II) → y] Semi-predictive methods describe x in two phases: Phase I: find a "good" tree source structure S and describe it using codelength l_S Phase II: encode x using S with probability assignment p_S(x) Phase I: estimate the minimum description length (MDL) tree source model S* = arg min_S {l_S – log(p_S(x))}

Slide 13

Semi-Predictive Approach – Phase II [Figure: x_i → determine state s from S* → assign p(x_i|s) → arithmetic encoder → y] Sequential encoding of x given S* Determine which state s of S* generated symbol x_i Assign x_i a conditional probability p(x_i|s) Arithmetic encoding p(x_i|s) can be based on the previously processed portion of x, quantized probability estimates, etc.

Slide 14

Context Trees [Figure: tree with root, node 0, node 11, node 01, and a special sentinel] We will give an O(N) semi-predictive algorithm by estimating S* using context trees Context trees arrange x in a tree Each node corresponds to the sequence of appended arc labels on the path to the root Internal nodes correspond to repeating contexts in x Leaves correspond to unique contexts Sentinel symbol x_0 = $ ensures symbols have distinct contexts

Slide 15

Context Tree Pruning ("To prune or not to prune…") [Figure: MDL structure for state s vs. the alternative structure with children 0s, 1s and grandchildren 00s, 10s] The MDL structure for state s yields the shortest description for symbols generated by s When processing state s: Estimate MDL structures for states 0s and 1s Decide whether to keep 0s and 1s or prune them into state s Base the decision on coding lengths

Slide 16

Phase I with Atomic Context Trees Atomic context tree: Arc labels are atomic (single symbol) Internal nodes are not necessarily branching Has up to O(N^2) nodes The coding length minimization of Phase I processes every node of the context tree [Nohre94] With atomic context trees, the worst-case complexity is at least O(N^2) ☹

Slide 17

Compact Context Trees Compact context tree: Arc labels are not necessarily atomic Internal nodes are branching O(N) nodes Compact representation of the same tree Depth-first traversal of the compact context tree gives O(N) complexity ☺ Theorem: Phase I of BWT-MDL requires O(N) operations performed with O(log(N)) bits of precision

Slide 18

Phase II of BWT-MDL [Figure: same Phase II block diagram as before — determine state s from S*, assign p(x_i|s), arithmetic encoder] We determine the generator state using a novel algorithm that relies on properties of the Burrows-Wheeler transform (BWT) Theorem: The BWT-MDL encoder requires O(N) operations performed with O(log(N)) bits of precision Theorem [Willems et al. 2000]: redundancy w.r.t. any tree source S is at most |S|[0.5 log(N)+O(1)] bits

Slide 19

Distributed/Parallel Compression of Bernoulli Sequences [Figure: splitter partitions x into blocks x(1),…,x(B); encoder j assigns p(x_i(j)) and arithmetic-encodes to y(j)] Splitter partitions x into B blocks x(1),…,x(B) Encoder j ∈ {1,…,B} compresses x(j); it assigns probabilities p(x_i(j)=1)=θ and p(x_i(j)=0)=1-θ The total probability assigned to x is identical to that in a serial compression system This structure assumes that θ is known; our goal is a universal parallel compression algorithm for Bernoulli sequences
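The claim that the parallel system assigns the same total probability as a serial one follows because an i.i.d. Bernoulli log-probability is additive over any partition of the sequence. A sketch of the splitter and the per-block encoders' probability assignment (block sizing and names are illustrative):

```python
import math

def block_logprob(block, theta):
    # Per-encoder probability assignment: i.i.d. Bernoulli(theta) on its block
    n1 = sum(block)
    n0 = len(block) - n1
    return n1 * math.log2(theta) + n0 * math.log2(1 - theta)

def parallel_logprob(x, B, theta):
    # Splitter: partition x into B contiguous blocks, then sum the
    # per-encoder log-probabilities. Equals the serial log-probability.
    size = -(-len(x) // B)  # ceil(len(x) / B)
    blocks = [x[i * size:(i + 1) * size] for i in range(B)]
    return sum(block_logprob(b, theta) for b in blocks)
```

Since arithmetic-coding length is -log2 of the assigned probability, equal total probability means (up to the per-encoder O(1) terms) no compression loss from parallelization when θ is known.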

Slide 20

Two-Part Codes [Figure: x → determine θ_ML(x) → quantizer → bin index k ∈ {1,…,K} with level r_k → encoder → y] Two-part codes use a semi-predictive approach to describe Bernoulli sequences: First part of the code: Determine the maximum likelihood (ML) parameter estimate θ_ML(x) = n_1/(n_0+n_1) Quantize θ_ML(x) to r_k, one of K representation levels Describe the bin index k with log(K) bits Second part of the code encodes x using r_k In distributed systems: Sequential compressors require O(N) internal communications Two-part codes need only communicate {n_0(j), n_1(j)}_{j∈{1,…,B}} Requires O(B log(K)) internal communications

Slide 21

Jeffreys Two-Part Code [Figure: bin edges b_0, b_1, …, b_K with representation levels r_1, …, r_K] Quantize θ_ML(x) Bin edges b_k = sin^2(kπ/2K) Representation levels r_k = sin^2((2k-1)π/4K) Use K ≈ 1.772·N^0.5 bins Source description: log(K) bits for describing the bin index k Need –n_1 log(θ_ML(x)) – n_0 log(1-θ_ML(x)) for encoding x
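The sin² formulas correspond to quantizing uniformly in the arcsine domain (θ = sin²(u), u uniform on [0, π/2]), which matches the Jeffreys prior density ∝ 1/√(θ(1-θ)) and makes the bins narrow near 0 and 1 where the entropy function is steep. A sketch of the quantizer, assuming the π factor in the extracted formulas (helper names are mine):

```python
import math

def jeffreys_bins(K):
    # Bin edges b_k = sin^2(k*pi/(2K)) for k = 0..K and representation
    # levels r_k = sin^2((2k-1)*pi/(4K)) for k = 1..K
    edges = [math.sin(k * math.pi / (2 * K)) ** 2 for k in range(K + 1)]
    reps = [math.sin((2 * k - 1) * math.pi / (4 * K)) ** 2 for k in range(1, K + 1)]
    return edges, reps

def quantize(theta_ml, K):
    # Return the 1-based bin index k with b_{k-1} < theta_ml <= b_k,
    # together with its representation level r_k
    edges, reps = jeffreys_bins(K)
    for k in range(1, K + 1):
        if theta_ml <= edges[k]:
            return k, reps[k - 1]
    return K, reps[-1]
```

The first part of the code then transmits k in log2(K) bits and both sides reconstruct r_k from K alone.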

Slide 22

Redundancy of Jeffreys Code for Bernoulli Sequences Redundancy: log(K) bits for describing k N·D(θ_ML(x)||r_k) bits for encoding x using the inexact model D(a||b) is the Kullback-Leibler divergence In bin k, l(x) = -θ_ML(x) log(r_k) - [1-θ_ML(x)] log(1-r_k) l(θ_ML(x)) is piecewise linear Redundancy = log(K) + l(θ_ML(x)) – N·H(θ_ML(x)) ≤ log(K) + L∞ Use quantizers that have a small L∞ distance between the entropy function and the induced piecewise-linear fit
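The redundancy decomposition above is directly computable: log2(K) bits for the bin index plus N times the binary KL divergence between the true ML parameter and the quantized level it was mapped to. A small sketch (function names are mine):

```python
import math

def kl_bernoulli(a, b):
    # Binary Kullback-Leibler divergence D(a||b) in bits,
    # with the 0*log(0) = 0 convention
    out = 0.0
    if a > 0:
        out += a * math.log2(a / b)
    if a < 1:
        out += (1 - a) * math.log2((1 - a) / (1 - b))
    return out

def two_part_redundancy(theta_ml, r_k, N, K):
    # log2(K) bits for the bin index, plus N * D(theta_ml || r_k) bits of
    # mismatch from encoding x with the quantized parameter r_k
    return math.log2(K) + N * kl_bernoulli(theta_ml, r_k)
```

When θ_ML lands exactly on a representation level the divergence term vanishes and only the log2(K) model-description cost remains, which is the piecewise-linear picture on the slide.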

Slide 23

Redundancy Properties For x s.t. θ_ML(x) is quantized to r_k, the worst-case redundancy is log(K) + N·max{D(b_k||r_k), D(b_{k-1}||r_k)} D(b_k||r_k) and D(b_{k-1}||r_k): Largest in the initial and end bins Similar in the middle bins Difference diminished over a wider range of k for larger N (larger K) Can build a near-optimal quantizer by adjusting the initial and end bins of the Jeffreys quantizer

Slide 24

Redundanc
