The CERN PAW Program

and "ntuples"

ntuples?
Sample ntuples: Omega and Cascade

This tutorial is relatively brief. Another useful tutorial is available from CERN.

What is an "ntuple"?

Events from the experiment are reconstructed, filtered and eventually written to an "ntuple" file. This file contains an entry for each interpretation of an event (there may be several for a single event!) The entry contains a few words real variables) containing pertinent quantities for this interpretation. This is called an "ntuple."

Different ntuples could be written for each event, each containing a different set of quantities. An ID number is associated with each ntuple.

Example: The Omega ntuple

The following quantities are stored for each interpretation of an event as an Omega. (NOTE that before writing this file, a vast number of events with no conceivable interpretation as Omega had already been rejected.) The ntuple records are in a file on the UNIX network:

/home/hepdisk3/briandata/omega.paw

ln -s /home/hepdisk3/briandata/omega.paw omega_data

The 17 Most Important Quantities in Omega Ntuple with ID number 2000
Variable #	Name	What it is.
1	NUMOM	Number of interpretations for this topology for this event.
2	CHGO	Charge of Omega. (-1. for Omega, +1. for anti-Omega)
3	CatK	Category for "K" track. If >1000 track does not appear in the SMD's
6	Qual	dtheta² + dphi². (radians²) Angular agreement between Lambda+K momentum and direction of Omega^- track in the SMD's
7	KCpi	Cherenkov probability for "K" track to be a pion
8	KCK	Cherenkov probability for "K" track to be a K
9	Lmz	z position of Lambda decay vertex.
10	Prmz	z position of primary (production) vertex.
13	Omz	z position of Omega^- decay vertex.
15	LM	Proton(1)+pi(2) effective mass - sqrt of (E1+E2)²-\|p1+p2\|²
16	XM	Lambda(1)+pi(2) effective mass - sqrt of (E1+E2)²-\|p1+p2\|²
17	OM	Lambda(1)+K(2) effective mass - sqrt of (E1+E2)²-\|p1+p2\|²
19	POM	\|p\| of Omega^- (GeV/c)
26	DIPO	Distance of closest approach of SMD Omega^- track to production vertex.
27	DIPK	Distance of closest approach of K^- track to production vertex.
14	ISMD	Positive if Omega decays downstream from SMD's; Negative if Omega decays within SMD's.
28	Nprm	Number of tracks at primary vertex

Another Example - the Cascade ntuple with ID=3000

A further ntuple available for this experiment contains entries of event interpretations as Cascades. These events differ from the Omega's in that Omega^- decays to Lambda and K^- while Cascade^- decays to Lambda and pi^-. This ntuple is in the area

/home/hepdisk3/briandata/cascade.paw

To access this enter the following unix command:

ln -s /home/hepdisk3/briandata/omega.paw xi_data

(alias $DATA/cascade.paw if you are logged in as "advlab1".) The table of contents is similar to that for Omegas except that the names of the variables often contain an "X" rather than an "O".

Tutorial Using the Omega paw file

NOTE - In the following examples of PAW commands:
PAW does not distinguish upper and lower case

Commands may exceed the screen's line width. They automatically wrap onto next line.

In addition to the Exercises below, devise some of your own.

HELP <command> will display a file on your screen in an editor. To read the file, these commands are used:

It is recommended you leave the CURSOR on the bottom (command) line of the HELP screen. "Enter" will place it there.
To EXIT the HELP screen	F3, <Esc> 3, or qq<Return>
To go to next page	F7 or <Esc> 7
To go to next page	F8 or <Esc> 8

Login to the UNIX network as advlab1 using the password provided and go to the directory where the ntuple is:

login: advlab1
password: abcxyz?

Run the paw program:

paw <Enter> then another <Enter>.
A graphic window should open and a prompt "PAW > " should appear in your command window. If not, ask Meadows.

Open the ntuple file for input:

PAW > h/file 1 omega_data

Print out the contents of the ntuple (you will see all that is in the table above and more):

PAW > n/print 2000

Plot the Lambda K- effective mass for Omega- candidates:

PAW > n/pl 2000.OM CHGO<0

Plot anti-Lambda K+ effective mass for anti-Omega candidates.

PAW > n/pl 2000.OM CHGO>0

Find out more about "n/pl"

PAW > help n/pl

Make a histogram with 50 bins starting at 1.65 and ending at 1.70

PAW > 1d 100 'Effective Mass of Omega Decay Products' 40 1.65 1.70

Now plot the Lambda K effective mass in this histogram:

PAW > n/pl 2000.om chgo<0 -100

Plot the anti-Omega (dashed) superimposed on this histogram.

PAW > n/pl 2000.om chgo>0 ! ! ! s

Make a printout of these plots:

PAW > exec ops ! This runs a macro file "ops.kumac" which will make all future plots become
! appended to a disk file "last.ps"
PAW > n/pl 2000.om chgo<0 -100 ! Now, in addition to appearing on the screen, this plot gets added to "last.ps"
PAW > n/pl 2000.om chgo>0 ! ! ! s ! Superimpose the anti-omega plot
PAW > exec cps ! This runs another macro file "cps.kumac" which closes "last.ps" and turns off

PAW > sh lpr -Pduplex ~/kumac/last.ps ! This executes the shell command to print "last.ps" (a local file in the
! directory ~/kumac) on printer duplex (in room 400)
! Other printer include 'rm400' (single side printing) 'rm439', etc..

Change the appearance of this plot:

PAW > n/pl 2000.om chgo>0 -100
set htyp 2
PAW > n/pl 2000.om chgo>0 ! ! s

Now plot these two histograms side by side:

zone 2 1 ;(first number is number of columns, second is number of rows.)
PAW > n/pl 2000.om chgo>0 -100
PAW > n/pl 2000.om chgo>0 ! ! s

Repeat, but make them appear one above the other:

zone 1 2
PAW > n/pl 2000.om chgo>0 -100
PAW > n/pl 2000.om chgo>0 ! ! ! s

Exercise 1:

Attempt to make a set of cuts on the following quantities which will reduce the background without significantly reducing the signal. (Some signal loss is inevitable, but hold it to less than 10% if you can while reducing the background by at least a factor two.)

CATK

DIPO

KCK

NUMOM

Exercise 2:

Now look at the Cascade ntuple.

Open the cascade file and close the omega one:

PAW > close 1 ! Close the Omega file

PAW > h/file 1 xi_data ! and open the cascade one.

Make a histogram ID 300 with 90 bins starting at 1.300 ending at 1.345 and plot Lambda+pi effective mass ("XM") for the two disjoint samples:

ISMD>0 (ie cascade decays downstream from the SMD's) and

ISMD<0 (ie cascade decays before end of SMD's).

Make these plots side by side on the screen.

Q: What do you notice? Does this make sense?

Repeat the above, but in each histogram superimpose also the subsamples with CATK>1000.

Q: What do you notice? Does this make sense?

Fitting

What a fit is:

PAW is able to fit histograms with any function f(x;a) where x is the abscissa of the histogram and a is a set of parameters which are required in the definition of f.

If the quantity x is histogrammed and f(x;a) is the form it should take [ie the number of entries y(x) in the bin from x to x+delta_x should be F(x; a) = Integral f(x;a)dx from x to x+delta_x ] then a "fit" will adjust all the a's so as to minimize the quantity:

Chi² = (y-F)²/sigma_y²

where sigma_y is the uncertainty on y(x). (PAW will take this to be the square root of y(x) - the uncertainty expected for this number of entries from Poisson statistics.)

A variation of this ("minimum Chi squared") method involves maximizing the quantity:

Log_Likelihood = y(x)*Log[f(x;a)]

This is the "Log-Likelihood" method. This works if the function is defined in such a way that its integral over the x range of the histogram does not depend upon the parameters a.

What kinds of fits PAW is capable of:

PAW is equipped to use either minimum chi squared or maximum likelihood methods to make fits (see the HELP file on h/fit). The form for f(x;a ) can be taken from a choice of three standard forms:

polynomial in x;

exponential in x or

Gaussian in x

with parameters a (see the example below) to define the precise form. In this example, a further feature of PAW - its ability to combine the first two of the above forms - is exploited.

PAW can also fit a user defined function. This requires that you write a FORTRAN function and save it as a file. An example of this can be seen in the file in

~brian/E791/Cascade/xeval.kumac

where a number of PAW commands are stored as well as the FORTRAN function to define a Breit-Wigner peak shape upon a polynomial background. This is a more advanced application of PAW.

Look for further HELP on this subject in PAW as follows:

PAW > help h/fit

If you have any problem, type help and follow "Histogram" (4) and "Fit" (9)

Make a fit to a Polynomial background of order 2 plus a Gaussian:

a(1)+a(2)*x+a(3)*x**2 + a(4)*exp(-0.5*((x-a(5))/a(6))**2) ....(1) Clearly this requires 6 parameters and here is how you define them:

PAW > vector/create a(6) R 0 0 0 0 1.672 0.004

This creates a REAL array "a" dimension 6 with data 1.0, 1.0, etc. (i.e . a(1)=a(2)=a(3)=a(4)=1.0, a(5)=1.672, a(6)=0.004). The first 4 values are not important to PAW as it will easily fit the polynomial parameters and the fraction of Gaussian on top. The last two numbers are the Mass and Resolution for the Omega peak respectively, and it is necessary to give PAW a good start on these values.

Now do the FIT:

PAW > h/fit 100 P2+G ! 6 a

That is all there is to it!

Exercise 3:

Find the Mass and resolution for Cascade minus and compare them for anti-Cascade.

Repeat for cleaner Omega and anti Omega samples using the "cuts":

Try the "disjoint sample" having

Q: What do you see? Explain it in terms of the spectrometer layout and where the SMD's are.

Using the Results of a Fit

In the above example, the results of the fit are stored in the PAW vector "a". Suppose you want to compute the area under the peak (as opposed to those in the background part.) From the context of equation (1) above we see that a(5) is the mean (or peak) value for the Gaussian and a(6) is the standard deviation. The area under the Gaussian is a(4)*a(6)*sqrt(2*pi). To obtain this number:

PAW > mess $sigma(a(6)*a(4)*sqrt(2*pi))

In this command the "mess" instruction asks PAW to print a message. The $sigma() part tells a subsystem (named "sigma") to evaluate the expression within the parentheses. To obtain more information on "sigma" type

PAW > help sigma ! (or look at the CERN tutorial.)

When you do as above, you will be surprised to find that the result is much smaller than you would expect from looking at your plot. The "mess" command will result in a number like 7 while it appears from the plot as if there are thousands of events in the peak. The problem is that to obtain the number of events, the curve needs to be divided by the bin size of the histogram. (Why is this?)

So - you will need the binsize for histogram ID 100. PAW is capable of providing information about any plot. Various functions "$HINFO" give this information and you will find these with

PAW > help k/functions

You will see a variety of functions, but the one you really need - binsize - is not directly available. You can however compute it from:

PAW > mess $sigma(($hinfo(100,'xmax')-$hinfo(100,'xmin'))/$hinfo(100,'xbins'))

(Note that PAW is not case sensitive like unix. Also note the balanced parentheses.)
Now see what the fit you made says of the number of events in the peak:

PAW > mess $sigma(a(4)*a(6)*sqrt(2*pi))*$hinfo(100,'xbins')/($hinfo(100,'xmax')-$hinfo(100,'xmin'))

(This command should all be on one line.) You will probably obtain a result like
7.072*50/(1.65-1.70)
and this is not quite what you want! So try

PAW > mess $eval($sigma(a(4)*a(6)*sqrt(2*pi))*$hinfo(100,'xbins')/($hinfo(100,'xmax')-$hinfo(100,'xmin')))

Finally, the result will now be the number of events in the peak!

More Complicated Example:

Suppose you want to compute the amount of background in the plot. Clearly this is the integral from low to high abscissae of the second order polynomial.

PAW is a bit clumsy, so it will be convenient for various reasons to make use of another vector "c" as follows:

PAW > v/cre c(3) r

PAW > v/input c(1) $hinfo(100,'xmin')

PAW > v/input c(2) $hinfo(100,'xmax')

PAW > v/input c(3) $hinfo(100,'xbins')

Make sure you have set up the plot and made the fit:

PAW > n/pl 2000.om chgo<0 -100

PAW > v/cre b(6) r 0 0 0 0 1.672 0.004

PAW > h/fit 100 $KUMAC/bwfit.for ! 6 b

Now do the background calculation:

PAW > v/input c(4) $sigma(b(1) + 1/2*b(2)*(c(2)+c(1)) + 1/3*b(3)*(c(2)**2+3*c(1)*c(2)+c(1)**2))

PAW > mess $eval(c(3)*c(4))

Exercise 4: (Write this up)

Find the asymmetry between production of Omega and anti Omega.

Use as clean a sample (with the same cuts in both cases!) as possible.

Find the asymmetry between production of Cascade and anti Cascade:

Once again, use as clean a sample (with the same cuts in both cases!) as possible.

Comment on the two results.

Use the errors from the fit to estimate the uncertainties in these asymmetries.

Investigate the effects of the cuts on the uncertainties. Define three sets of cuts:

Little or no cutting (big signal and big background)
Good cuts like those you found above (only slightly reduced signal and greatly reduced background)
Exaggerated cuts (signal reduced more than necessary, but background also cut more.)

More Complicated Fits and examples - FORTRAN.

Sometimes it is necessary to make a complex calculation. Perhaps a fit requires a function different from those provided by the "h/fit" command. For example we may wish to fit a Breit-Wigner peak rather than a Gaussian to the Omega- in histogram ID 100 of the example above.

This can be done using a FORTRAN subroutine "BWFIT(M)" defined in a file "$KUMAC/bwfit.for" on your disk. To use this to make the required fit:

First define the parameters with starting values. These must be REAL and correspond in order defined to those in the function to be fit (see the file below.)

PAW > v/cre b(6) r 0 0 0 0 1.672 0.004

Now make the fit

PAW > h/fit 100 $KUMAC/bwfit.for ! 6 b

The vector "b" will now contain optimised values for the parameters. As you will now see, b(5) is the Breit-Wigner peak mass and b(6) is the "natural width" - the full width at half maximum of the peak shape.

The file contains the FORTRAN code

      FUNCTION BWFIT(M)
C
C Functional form of linear background and Breit-Wigner peak.
C
      REAL M
      COMMON/PAWPAR/ B(6)       ! Parameters from PAW
C                                                       ! (created by "v/cre B(6) r <values>")
C
C Quadratic Polynomial Background.
        BACKGROUND = B(1) + B(2)*M + B(3)*M**2
C
C Breit-Wigner Peak times A(4).
        BREITWIGNER = B(4)*B(5)*B(6)/((M**2-B(5)**2)**2+(B(5)*B(6))**2)
C
C The combination.
        BWFIT = BACKGROUND + BREITWIGNER
C
      RETURN
      END

This uses parameters in vector B (dimension 5) and evaluates the second order polynomial background and the Breit-Wigner peak and forms a linear combination of them in BWFIT at an arbitrary value of Lambda K- mass M. This function is called during the fitting process probably thousands of times while the parameters in B are being optimised.

Example Using a FORTRAN Routine:

Perhaps we wanted the background within an arbitrary range "S" of the peak.

PAW > call backg.for(0.01)

This will use the code in file $KUMAC/backg.for to compute background within 0.01 GeV from the peak. The file contains the code:

      SUBROUTINE BACKG(S)
C
C Compute the background under the peak.
C
      REAL S, LOW, HIGH, DX
      VECTOR B          ! Special decaration of vector created in PAW by
C                         "PAW > v/cre A(5) ...etc."
C                       ! (now optimised by the FIT)
      VECTOR C(3)       ! Declaration for the vector containing the
C                         low x, high x and number of bins respectively.
C
C Quadratic Background is B(1) + B(2)*M + B(3)*M**2
C Integrate from peak-S to peak+S
      LOW = B(5)-S
      HIGH = B(5)+S
C Limit the limits to be within the plot boundary:
      IF(LOW.lt.C(1) .or. S.eq.0.0) LOW = C(1)
      IF(HIGH.gt.C(2) .or. S.eq.0.0) HIGH = C(2)
      DX = (C(2)-C(1))/C(3)
      BACKG = (B(1)*(HIGH-LOW)
     A      + 0.5*B(2)*(HIGH**2 - LOW**2)
     A      + 1/3*B(3)*(HIGH**3 - LOW**3)) / DX
      PRINT *, 'Background has', BACKG,' events from',LOW,' to',HIGH
C
      RETURN
      END

Plotting Functions

PAW will draw arbitrary functions which may be defined on the command line:

PAW > fun/pl (5*cos(x)+3*sin(x))*exp(-.5*x) 0 10

to plot the indicated function in the range from x=0 to x=10.

PAW > fun/pl $eval(b(1))+$eval(b(2))*x+$eval(b(3))*x**2 $hinfo(100,'xmin') $hinfo(100,'xmax') s

will plot the background superimposed on the last plot drawn from low to high abscissa.
[Note that PAW is clumsy and does not allow use of vector elements in formulae unless they are evaluated by a $eval()]

Functions can also be drawn using fortran routines to define them:

PAW > fun/pl $KUMAC/backg.for(x) 0.001 0.1

which should plot the computed background within limits from 1 MeV up to 100 MeV (actually the whole plot) from the peak value..

Exercise 5:

Use the Cascade sample and plot "XM" for the selection "CHGO<0" in a histogram with 50 bins from 1.3 GeV to 1.35 GeV.

Make a fit with Gaussian peak on linear background.

Use the fit to determine how many background events there are. Print this number with the number of signal events printed on the same line.

[The number of events in the signal can be taken to be the $hinfo(id,'SUM') minus the background.]

Now repeat the sequence above making the selection CATK<1000. (Use the up arrow - perhaps the <CTRL P> key!)

Try other selections to find how effective they are in reducing the background while preserving the signal.

Prof Brian T. Meadows
Physics Department, ML 11
Universtiy of Cincinnati
Cincinnati, OH 45221-0011

tel: 513 556-0531
fax: 513 556-3425
E-mail: Brian.Meadows@UC.Edu

The CERN PAW Program

and "ntuples"

What is an "ntuple"?

Example: The Omega ntuple

/home/hepdisk3/briandata/omega.paw

Variable #

Name

What it is.

Another Example - the Cascade ntuple with ID=3000

/home/hepdisk3/briandata/cascade.paw

Tutorial Using the Omega paw file

Login to the UNIX network as advlab1 using the password provided and go to the directory where the ntuple is:

Run the paw program:

Open the ntuple file for input:

Print out the contents of the ntuple (you will see all that is in the table above and more):

Plot the Lambda K- effective mass for Omega- candidates:

Plot anti-Lambda K+ effective mass for anti-Omega candidates.

Find out more about "n/pl"

Make a histogram with 50 bins starting at 1.65 and ending at 1.70

Now plot the Lambda K effective mass in this histogram:

Plot the anti-Omega (dashed) superimposed on this histogram.

Make a printout of these plots:

Change the appearance of this plot:

Now plot these two histograms side by side:

Repeat, but make them appear one above the other:

Exercise 1:

Exercise 2:

Now look at the Cascade ntuple.

Q: What do you notice? Does this make sense?

Q: What do you notice? Does this make sense?

Fitting

What a fit is:

What kinds of fits PAW is capable of:

~brian/E791/Cascade/xeval.kumac

Look for further HELP on this subject in PAW as follows:

Make a fit to a Polynomial background of order 2 plus a Gaussian:

Now do the FIT:

Exercise 3:

Q: What do you see? Explain it in terms of the spectrometer layout and where the SMD's are.

Using the Results of a Fit

More Complicated Example:

Exercise 4: (Write this up)

More Complicated Fits and examples - FORTRAN.

Example Using a FORTRAN Routine:

Plotting Functions

Exercise 5: