The CERN PAW Program
and "ntuples"
This tutorial is relatively brief. Another useful
tutorial
is available from CERN.
What is an "ntuple"?
Events from the experiment are reconstructed, filtered and eventually written
to an "ntuple" file. This file contains an entry for each interpretation
of an event (there may be several for a single event!) The entry contains
a few words real variables) containing pertinent quantities for this interpretation.
This is called an "ntuple."
Different ntuples could be written for each event, each containing a different
set of quantities. An ID number is associated with each ntuple.
Example: The Omega ntuple
The following quantities are stored for each interpretation of an event as
an Omega. (NOTE that before writing this file, a vast number of events
with no conceivable interpretation as Omega had already been rejected.)
The ntuple records are in a file on the UNIX network:
/home/hepdisk3/briandata/omega.paw
To access this enter the following unix command:
ln -s /home/hepdisk3/briandata/omega.paw omega_data
The 17 Most Important Quantities in Omega Ntuple
with ID number 2000
Variable #
|
Name
|
What it is.
|
1 |
NUMOM |
Number of interpretations for this topology
for this event. |
2 |
CHGO |
Charge of Omega. (-1. for Omega, +1. for
anti-Omega) |
3 |
CatK |
Category for "K" track. If >1000 track
does not appear in the SMD's |
6 |
Qual |
dtheta2 + dphi2. (radians
2)
Angular agreement between Lambda+K momentum and direction of Omega-
track in the SMD's |
7 |
KCpi |
Cherenkov probability for "K" track to be a
pion |
8 |
KCK |
Cherenkov probability for "K" track to be a
K |
9 |
Lmz |
z position of Lambda decay vertex. |
10 |
Prmz |
z position of primary (production) vertex. |
13 |
Omz |
z position of Omega- decay vertex. |
15 |
LM |
Proton(1)+pi(2) effective mass - sqrt of (E1+E2)
2-|p1+p2|2 |
16 |
XM |
Lambda(1)+pi(2) effective mass - sqrt of (E1+E2)
2-|p1+p2|2 |
17 |
OM |
Lambda(1)+K(2) effective mass - sqrt of (E1+E2)
2-|p1+p2|2 |
19 |
POM |
|p| of Omega- (GeV/c) |
26 |
DIPO |
Distance of closest approach of SMD Omega
- track to production vertex. |
27 |
DIPK |
Distance of closest approach of K-
track to production vertex. |
14 |
ISMD |
Positive if Omega decays downstream from SMD's;
Negative if Omega decays within SMD's. |
28 |
Nprm |
Number of tracks at primary vertex |
Another Example - the Cascade ntuple with ID=3000
A further ntuple available for this experiment contains entries of event
interpretations as Cascades. These events differ from the Omega's in
that Omega- decays to Lambda and K- while Cascade
- decays to Lambda and pi-. This ntuple is in the
area
/home/hepdisk3/briandata/cascade.paw
To access this enter the following unix
command:
ln -s /home/hepdisk3/briandata/omega.paw
xi_data
(alias $DATA/cascade.paw if you are logged in as "advlab1".)
The table of contents is similar to that for Omegas except that the names
of the variables often contain an "X" rather than an "O".
Tutorial Using the Omega paw file
NOTE - In the following examples of PAW commands:
PAW does not distinguish upper and lower case
Commands may exceed the screen's line width. They automatically
wrap onto next line.
In addition to the Exercises below, devise some of your own.
HELP <command> will display a file on your screen in an editor.
To read the file, these commands are used:
It is recommended you leave the CURSOR on the bottom (command)
line of the HELP screen. "Enter" will place it there. |
|
To EXIT the HELP screen |
F3, <Esc> 3, or qq<Return> |
To go to next page |
F7 or <Esc> 7 |
To go to next page |
F8 or <Esc> 8 |
Login to the UNIX network as advlab1 using the password provided
and go to the directory where the ntuple is:
- login: advlab1
- password: abcxyz?
Run the paw program:
- paw <Enter> then another <Enter>.
- A graphic window should open and a prompt "PAW > " should appear
in your command window. If not, ask Meadows.
Open the ntuple file for input:
- PAW > h/file 1 omega_data
Print out the contents of the ntuple (you will see all that is in
the table above and more):
Plot the Lambda K- effective mass for Omega- candidates:
- PAW > n/pl 2000.OM CHGO<0
Plot anti-Lambda K+ effective mass for anti-Omega candidates.
- PAW > n/pl 2000.OM CHGO>0
Find out more about "n/pl"
Make a histogram with 50 bins starting at 1.65 and ending at 1.70
- PAW > 1d 100 'Effective Mass of Omega Decay Products' 40 1.65 1.70
Now plot the Lambda K effective mass in this histogram:
- PAW > n/pl 2000.om chgo<0 -100
Plot the anti-Omega (dashed) superimposed on this histogram.
- PAW > n/pl 2000.om chgo>0 ! ! ! s
Make a printout of these plots:
- PAW > exec ops
! This runs a macro file "ops.kumac" which will make all future plots become
-
! appended to a disk file "last.ps"
- PAW > n/pl 2000.om chgo<0 -100
! Now, in addition to appearing on the screen, this plot gets added to "last.ps"
- PAW > n/pl 2000.om chgo>0 ! ! ! s
! Superimpose the anti-omega plot
- PAW > exec cps
! This runs another macro file "cps.kumac" which closes "last.ps" and turns
off
! the simultaneous process of appending plots to it. - PAW > sh lpr
-Pduplex ~/kumac/last.ps !
This executes the shell command to print "last.ps" (a local file in the
-
! directory ~/kumac) on printer duplex (in room 400)
-
! Other printer include 'rm400' (single side printing) 'rm439', etc..
Change the appearance of this plot:
- PAW > n/pl 2000.om chgo>0 -100
- set htyp 2
- PAW > n/pl 2000.om chgo>0 ! ! s
Now plot these two histograms side by side:
- zone 2 1 ;(first number is number of columns,
second is number of rows.)
- PAW > n/pl 2000.om chgo>0 -100
- PAW > n/pl 2000.om chgo>0 ! ! s
Repeat, but make them appear one above the other:
- zone 1 2
- PAW > n/pl 2000.om chgo>0 -100
- PAW > n/pl 2000.om chgo>0 ! ! ! s
Exercise 1:
Attempt to make a set of cuts on the following quantities which will reduce
the background without significantly reducing the signal. (Some signal
loss is inevitable, but hold it to less than 10% if you can while reducing
the background by at least a factor two.)
CATK
DIPO
KCK
NUMOM
Exercise 2:
Now look at the Cascade ntuple.
Open the cascade file and close the omega one:
PAW > close 1
! Close the Omega file
PAW > h/file 1 xi_data
! and open the cascade one.
Make a histogram ID 300 with 90 bins starting at 1.300 ending at
1.345 and plot Lambda+pi effective mass ("XM") for the two disjoint samples:
ISMD>0 (ie cascade decays downstream from the SMD's)
and
ISMD<0 (ie cascade decays before end of SMD's).
Make these plots side by side on the screen.
Q: What do you notice? Does this make sense?
Repeat the above, but in each histogram superimpose also the subsamples
with CATK>1000.
Q: What do you notice? Does this make sense?
Fitting
What a fit is:
PAW is able to fit histograms with any function f(x;a) where
x is the abscissa of the histogram and a is a set
of parameters which are required in the definition of f.
If the quantity x is histogrammed and f(x;a)
is the form it should take [ie the number of entries
y(x) in the bin from x to x+delta_x should be F(x;
a) = Integral f(x;a)dx from x to x+delta_x ] then a
"fit" will adjust all the a's so as to minimize the quantity:
where sigma_y is the uncertainty on y(x). (PAW
will take this to be the square root of y(x) - the uncertainty expected
for this number of entries from Poisson statistics.)
A variation of this ("minimum Chi squared") method involves maximizing
the quantity:
Log_Likelihood = y(x)*Log[f(x;a)] (summed
over the bins)
This is the "Log-Likelihood" method. This works if the function is
defined in such a way that its integral over the x range of the histogram
does not depend upon the parameters a.
What kinds of fits PAW is capable of:
PAW is equipped to use either minimum chi squared or maximum likelihood methods
to make fits (see the HELP file on h/fit). The form for f(x;a
) can be taken from a choice of three standard forms: polynomial
in x;
exponential in x or
Gaussian in x
with parameters a (see the example below) to define the precise
form. In this example, a further feature of PAW - its ability to combine
the first two of the above forms - is exploited.
PAW can also fit a user defined function. This requires that
you write a FORTRAN function and save it as a file. An example of this
can be seen in the file in
~brian/E791/Cascade/xeval.kumac
where a number of PAW commands are stored as well as the FORTRAN function
to define a Breit-Wigner peak shape upon a polynomial background. This
is a more advanced application of PAW.
Look for further HELP on this subject in PAW as follows:
If you have any problem, type help and follow "Histogram" (4) and "Fit" (9)
Make a fit to a Polynomial background of order 2 plus a Gaussian:
a(1)+a(2)*x+a(3)*x**2 + a(4)*exp(-0.5*((x-a(5))/a(6))**2)
....(1)
Clearly this requires 6 parameters and here is how you define them:
- PAW > vector/create a(6) R 0 0 0 0 1.672 0.004
This creates a REAL array "a" dimension 6 with data 1.0, 1.0, etc. (i.e
. a(1)=a(2)=a(3)=a(4)=1.0, a(5)=1.672, a(6)=0.004). The first 4 values
are not important to PAW as it will easily fit the polynomial parameters and
the fraction of Gaussian on top. The last two numbers are the Mass
and Resolution for the Omega peak respectively, and it is necessary to give
PAW a good start on these values.
Now do the FIT:
- PAW > h/fit 100 P2+G ! 6 a
That is all there is to it!
Exercise 3:
Find the Mass and resolution for Cascade minus and compare them for
anti-Cascade.
Repeat for cleaner Omega and anti Omega samples using the "cuts":
Try the "disjoint sample" having
Q: What do you see? Explain it in terms of the spectrometer
layout and where the SMD's are.
Using the Results of a Fit
In the above example, the results of the fit are stored in the PAW vector
"a". Suppose you want to compute the area under the peak (as opposed
to those in the background part.) From the context of equation (1) above
we see that a(5) is the mean (or peak) value for the Gaussian and a(6) is
the standard deviation. The area under the Gaussian is a(4)*a(6)*sqrt(2*pi).
To obtain this number:
PAW > mess $sigma(a(6)*a(4)*sqrt(2*pi))
In this command the "mess" instruction asks PAW to print a message.
The $sigma() part tells a subsystem (named "sigma") to evaluate the expression
within the parentheses. To obtain more information on "sigma" type
PAW > help sigma
! (or look at the CERN tutorial.)
When you do as above, you will be surprised to find that the result is much
smaller than you would expect from looking at your plot. The "mess"
command will result in a number like 7 while it appears from the plot as
if there are thousands of events in the peak. The problem is that to
obtain the number of events, the curve needs to be divided by the bin size
of the histogram. (Why is this?)
So - you will need the binsize for histogram ID 100. PAW
is capable of providing information about any plot. Various functions
"$HINFO" give this information and you will find these with
PAW > help k/functions
You will see a variety of functions, but the one you really need - binsize
- is not directly available. You can however compute it from:
PAW > mess $sigma(($hinfo(100,'xmax')-$hinfo(100,'xmin'))/$hinfo(100,'xbins'))
(Note that PAW is not case sensitive like unix. Also note the balanced
parentheses.)
Now see what the fit you made says of the number of events in the peak:
PAW > mess $sigma(a(4)*a(6)*sqrt(2*pi))*$hinfo(100,'xbins')/($hinfo(100,'xmax')-$hinfo(100,'xmin'))
(This command should all be on one line.) You will probably obtain
a result like
7.072*50/(1.65-1.70)
and this is not quite what you want! So try
PAW > mess $eval($sigma(a(4)*a(6)*sqrt(2*pi))*$hinfo(100,'xbins')/($hinfo(100,'xmax')-$hinfo(100,'xmin')))
Finally, the result will now be the number of events in the peak!
More Complicated Example:
Suppose you want to compute the amount of background in the plot. Clearly
this is the integral from low to high abscissae of the second order polynomial.
PAW is a bit clumsy, so it will be convenient for various
reasons to make use of another vector "c" as follows:
PAW > v/cre c(3) r
PAW > v/input c(1) $hinfo(100,'xmin')
PAW > v/input c(2) $hinfo(100,'xmax')
PAW > v/input c(3) $hinfo(100,'xbins')
Make sure you have set up the plot and made the fit:
PAW > n/pl 2000.om chgo<0 -100
PAW > v/cre b(6) r 0 0 0 0 1.672 0.004
PAW > h/fit 100 $KUMAC/bwfit.for ! 6 b
Now do the background calculation:
PAW > v/input c(4) $sigma(b(1)
+ 1/2*b(2)*(c(2)+c(1)) + 1/3*b(3)*(c(2)**2+3*c(1)*c(2)+c(1)**2))
PAW > mess $eval(c(3)*c(4))
Exercise 4: (Write this up)
Find the asymmetry between production of Omega and anti Omega.
- Use as clean a sample (with the same cuts in both cases!) as possible.
Find the asymmetry between production of Cascade and anti Cascade:
- Once again, use as clean a sample (with the same cuts in both cases!)
as possible.
Comment on the two results.
Use the errors from the fit to estimate the uncertainties in
these asymmetries.
Investigate the effects of the cuts on the uncertainties.
Define three sets of cuts:
- Little or no cutting (big signal and big background)
- Good cuts like those you found above (only slightly reduced signal
and greatly reduced background)
- Exaggerated cuts (signal reduced more than necessary, but background
also cut more.)
More Complicated Fits and examples - FORTRAN.
Sometimes it is necessary to make a complex calculation. Perhaps a
fit requires a function different from those provided by the "h/fit" command.
For example we may wish to fit a Breit-Wigner peak rather than a Gaussian
to the Omega- in histogram ID 100 of the example above.
This can be done using a FORTRAN subroutine "BWFIT(M)"
defined in a file "$KUMAC/bwfit.for" on your disk. To use this to make
the required fit:
First define the parameters with starting values.
These must be REAL and correspond in order defined to those in the function
to be fit (see the file below.)
PAW > v/cre b(6) r 0 0 0 0 1.672
0.004
Now make the fit
PAW > h/fit 100 $KUMAC/bwfit.for
! 6 b
The vector "b" will now contain optimised values for the parameters.
As you will now see, b(5) is the Breit-Wigner peak mass and b(6) is the "natural
width" - the full width at half maximum of the peak shape.
The file contains the FORTRAN code
FUNCTION BWFIT(M)
C
C Functional form of
linear background and Breit-Wigner peak.
C
REAL M
COMMON/PAWPAR/ B(6) ! Parameters from
PAW
C
! (created by "v/cre B(6) r <values>")
C
C Quadratic Polynomial
Background.
BACKGROUND = B(1) + B(2)*M + B(3)*M**2
C
C Breit-Wigner Peak times
A(4).
BREITWIGNER = B(4)*B(5)*B(6)/((M**2-B(5)**2)**2+(B(5)*B(6))**2)
C
C The combination.
BWFIT = BACKGROUND + BREITWIGNER
C
RETURN
END
This uses parameters in vector B (dimension 5) and evaluates the second order
polynomial background and the Breit-Wigner peak and forms a linear combination
of them in BWFIT at an arbitrary value of Lambda K- mass M. This function
is called during the fitting process probably thousands of times while the
parameters in B are being optimised.
Example Using a FORTRAN Routine:
Perhaps we wanted the background within an arbitrary range "S" of the peak.
PAW > call backg.for(0.01)
This will use the code in file $KUMAC/backg.for to compute background within
0.01 GeV from the peak. The file contains the code:
SUBROUTINE BACKG(S)
C
C Compute the background
under the peak.
C
REAL S, LOW, HIGH, DX
VECTOR B ! Special
decaration of vector created in PAW by
C
"PAW > v/cre A(5) ...etc."
C
! (now optimised by the FIT)
VECTOR C(3) ! Declaration for the vector
containing the
C
low x, high x and number of bins respectively.
C
C Quadratic Background
is B(1) + B(2)*M + B(3)*M**2
C Integrate from
peak-S to peak+S
LOW = B(5)-S
HIGH = B(5)+S
C Limit the limits
to be within the plot boundary:
IF(LOW.lt.C(1) .or. S.eq.0.0) LOW = C(1)
IF(HIGH.gt.C(2) .or. S.eq.0.0) HIGH = C(2)
DX = (C(2)-C(1))/C(3)
BACKG = (B(1)*(HIGH-LOW)
A + 0.5*B(2)*(HIGH**2 - LOW**2)
A + 1/3*B(3)*(HIGH**3 - LOW**3)) / DX
PRINT *, 'Background has', BACKG,' events from',LOW,' to',HIGH
C
RETURN
END
Plotting Functions
PAW will draw arbitrary functions which may be defined on the command line:
PAW > fun/pl (5*cos(x)+3*sin(x))*exp(-.5*x)
0 10
to plot the indicated function in the range from x=0 to x=10.
PAW > fun/pl $eval(b(1))+$eval(b(2))*x+$eval(b(3))*x**2
$hinfo(100,'xmin') $hinfo(100,'xmax') s
will plot the background superimposed on the last plot drawn from low to
high abscissa.
[Note that PAW is clumsy and does not allow use of vector elements in
formulae unless they are evaluated by a $eval()]
Functions can also be drawn using
fortran routines to define them:
PAW > fun/pl $KUMAC/backg.for(x)
0.001 0.1
which should plot the computed background within limits from 1 MeV up to
100 MeV (actually the whole plot) from the peak value..
Exercise 5:
Use the Cascade sample and plot "XM" for the selection "CHGO<0"
in a histogram with 50 bins from 1.3 GeV to 1.35 GeV.
Make a fit with Gaussian peak on linear background.
Use the fit to determine how many background events there are.
Print this number with the number of signal events printed on the same line.
[The number of events in the signal
can be taken to be the $hinfo(id,'SUM') minus the background.]
Now repeat the sequence above making the selection CATK<1000.
(Use the up arrow - perhaps the <CTRL P> key!)
Try other selections to find how effective they are in reducing the
background while preserving the signal.
Prof Brian T. Meadows
Physics Department, ML 11
Universtiy of Cincinnati
Cincinnati, OH 45221-0011 |
tel: 513 556-0531
fax: 513 556-3425
E-mail: Brian.Meadows@UC.Edu
|
|
|