
title: "Creating R Classes (S3): Regression Contexts or Similar"
author: "Paul E. Johnson"
date: '20150206'
output:
slidy_presentation:
toc: true
highlight: haddock
smart: false
fig_width: 7
fig_height: 5
fig_caption: true
font_adjustment: 1

## Navigation Guide in a Web Browser
 These work in Firefox on Ubuntu as of 20150204
 'C' Show table of contents
 'S' Make fonts smaller
 'B' Make fonts larger
 Documentation says these ought to work, but I have intermittent success
 'F' Toggles the display of the footer
 'A' Toggles display of current vs all slides (useful for printing handouts)
## Abstract
 People say that R is "objectoriented," but I had not appreciated
the underlying details until 2013 or so (after using R for 12+
years)
Now I notice the pattern!
 Here's the basic recipe
 estimator function creates data structures (not presentable to
humans) and declares that thing to be of a certain class.
 programmer then writes summary and other methods that
restructure the data.
 programmer writes display methods: print, plot, for those
revised data structures.
 Behind the scene, information is passed around in a relatively ugly format
 We show the users a prettiedup version.
 Want examples I can explain?
 Look in rockchalk
 meanCenter, residualCenter, standardize. functions that
receive regressions, reassign their classes.
 pctable, a newer function discussed next
# Table example
## Need presentable cross tabulations, pronto! {.allowframebreaks}
In the "secure room", we wanted cross tabulations. "Check this, check that".
 R table output too sparse
```r
table(dat$yf1, dat$xf1)
```
```
##
## Alberquerque Denver Santa Fe
## a 6 9 9
## b 8 13 6
## c 12 12 10
## d 13 6 10
## e 6 13 10
```
 What's wrong with that?
 no margins
 no percents
 Sure, I know about adding those things (prop.table, addmargins), but
I needed a oneanddone function.
## Clunky to use, too {.allowframebreaks}
Compared to other R functions, there is no beauty in the table function.
```r
table(dat$yf1, dat$xf1)
```
I wanted something that looked more like a regression model
```
myTable(yf1 ~ xf1, data = dat)
```
 I wanted to replace this style
```r
table(dat$yf1, dat$xf1, dnn = list("My Great Row Var", "My Column Var"))
```
```
## My Column Var
## My Great Row Var Alberquerque Denver Santa Fe
## a 6 9 9
## b 8 13 6
## c 12 12 10
## d 13 6 10
## e 6 13 10
```
with this:
```
myTable(yf1 ~ xf1, data = dat, rvlab = "My Great Row Var", cvlab = "My Column Var")
```
## The function tabl (for the NLTS2 project) did about the right thing.
```r
tabl(yf1, xf1, data = dat)
```
```
## xf1
## yf1 Alberquerque Denver Santa Fe NA Sum
## a 6(12.8%) 9(16.7%) 9(20%) 0(0%) 24
## b 8(17%) 13(24.1%) 6(13.3%) 1(25%) 28
## c 12(25.5%) 12(22.2%) 10(22.2%) 0(0%) 34
## d 13(27.7%) 6(11.1%) 10(22.2%) 1(25%) 30
## e 6(12.8%) 13(24.1%) 10(22.2%) 2(50%) 31
## NA 2(4.3%) 1(1.9%) 0(0%) 0(0%) 3
## Sum 47 54 45 4 150
```
Declaration like so
```
tabl < function(x, y, data = parent.frame(), xlab = NULL,
ylab = NULL, exclude = NULL, rounded = FALSE)
```
 That was not very flexible, but went in good direction
 data frame argument!
 Some argument magic here. Allows users to name symbols as input
variables, not quoted variable names "xf1". That gets sacrificed
in the sequel.
 Defaults to show missing values
## pctable() in the rockchalk package 1.8.90+ {.allowframebreaks}
 Much more flexible
 use a formula interface
```
t1 < pctable(yf1 ~ xf1, dat)
```
creates a data structure in t1 that can be used to spit out various summaries
 Or even use it the old way
```
pctable(dat$yf1, dat$xf1)
```
 It turns out it is not allowed/practical to have this interface,
```
pctable(yf1, xf1, dat)
```
 However, I found a way to implement this
```
pctable("yf1", "xf1", dat)
```
which is almost as good.
 Found ways to export to LaTeX and Word compatible formats, so students can make tables that don't offend the reader
 Keep options open to include counts and percents in same table
## pctable is pronounced "presentable"{.allowframebreaks}
```r
library(rockchalk)
pctable(yf1 ~ xf1, data = dat, rvlab = "A Thing I Predict", cvlab = "A Mighty Predictor!")
```
```
## Count (column %)
## A Mighty Predictor!
## A Thing I Predict Alberquerque Denver Santa Fe Sum
## a 6(13.3%) 9(17%) 9(20%) 24
## b 8(17.8%) 13(24.5%) 6(13.3%) 27
## c 12(26.7%) 12(22.6%) 10(22.2%) 34
## d 13(28.9%) 6(11.3%) 10(22.2%) 29
## e 6(13.3%) 13(24.5%) 10(22.2%) 29
## Sum 45 53 45 143
```
## In my opinion, nobody should ever want row percents, but if you do, I will still be your friend:
```r
pctable(yf1 ~ xf1, data = dat, rowpct = TRUE, colpct = FALSE)
```
```
## Count (row %)
## xf1
## yf1 Alberquerque Denver Santa Fe Sum
## a 6(25%) 9(37.5%) 9(37.5%) 24
## b 8(29.6%) 13(48.1%) 6(22.2%) 27
## c 12(35.3%) 12(35.3%) 10(29.4%) 34
## d 13(44.8%) 6(20.7%) 10(34.5%) 29
## e 6(20.7%) 13(44.8%) 10(34.5%) 29
## Sum 45 53 45 143
```
## Make a high quality table

xf1 
yf1 
Alberquerque 
Denver 
Santa Fe 
NA 
Sum 
a 
6(12.8%) 
9(16.7%) 
9(20%) 
0(0%) 
24 
b 
8(17%) 
13(24.1%) 
6(13.3%) 
1(25%) 
28 
c 
12(25.5%) 
12(22.2%) 
10(22.2%) 
0(0%) 
34 
d 
13(27.7%) 
6(11.1%) 
10(22.2%) 
1(25%) 
30 
e 
6(12.8%) 
13(24.1%) 
10(22.2%) 
2(50%) 
31 
NA 
2(4.3%) 
1(1.9%) 
0(0%) 
0(0%) 
3 
Sum 
47 
54 
45 
4 
150 
## What's required to make this work?{.allowframebreaks}
From rockchalk source code, this is it!
```r
pctable < function(rv, ...)
{
UseMethod("pctable")
}
pctable.formula < function(formula, data = NULL, rvlab = NULL,
cvlab = NULL, colpct = TRUE, rowpct = FALSE,
rounded = FALSE, ...)
{
if (missing(formula)  (length(formula) != 3L))
stop("pctable requires a two sided formula")
mt < terms(formula, data = data)
if (attr(mt, "response") == 0L) stop("response variable is required")
mf < match.call(expand.dots = TRUE)
mfnames < c("formula", "data", "subset", "xlev", "na.action", "drop.unused.levels")
keepers < match(mfnames, names(mf), 0L)
mf < mf[c(1L, keepers)]
## mf$drop.unused.levels < FALSE
mf[[1L]] < quote(stats::model.frame)
dots < list(...)
## remove used arguments from dots, otherwise errors happen
## when unexpected arguments pass through. Don't know why
for (i in c("subset", "xlev", "na.action", "drop.unused.levels")) dots[[i]] < NULL
mf < eval(mf, parent.frame())
mfnames < names(mf)
response < attr(attr(mf, "terms"), "response")
## response is column 1
rvname < mfnames[response]
cvname < mfnames[response][1] ##just take 2?
rvlab < if (missing(rvlab)) rvname else rvlab
cvlab < if (missing(cvlab)) cvname else cvlab
arglist < list(rv = mf[[rvname]], cv = mf[[cvname]],
rvlab = rvlab, cvlab = cvlab,
colpct = colpct, rowpct = rowpct,
rounded = rounded)
arglist < modifyList(arglist, dots, keep.null = TRUE)
##keep.null needed because exclude = NULL can be a valid
## (meaningful) argument to table.
res < do.call(pctable.default, arglist)
invisible(res)
}
pctable.character < function(rv, cv, data = NULL, rvlab = NULL,
cvlab = NULL, colpct = TRUE,
rowpct = FALSE,
rounded = FALSE, ...)
{
if (missing(data)  !is.data.frame(data)) stop("pctable requires a data frame")
cv < as.character(substitute(cv))[1L]
rvlab < if (missing(rvlab)) rv else rvlab
cvlab < if (missing(cvlab)) cv else cvlab
res < pctable.formula(formula(paste(rv, " ~ ", cv)), data = data,
rvlab = rvlab, cvlab = cvlab, colpct = colpct,
rowpct = rowpct, rounded = rounded, ...)
invisible(res)
}
pctable.default < function(rv, cv,
rvlab = NULL, cvlab = NULL,
colpct = TRUE, rowpct = FALSE,
rounded = FALSE, ...)
{
rvlabel < if (!missing(rv)) deparse(substitute(rv))
cvlabel < if (!missing(cv)) deparse(substitute(cv))
rvlab < if (is.null(rvlab)) rvlabel else rvlab
cvlab < if (is.null(cvlab)) cvlabel else cvlab
dots < list(...)
dotnames < names(dots)
## altargs < list()
## if ("dnn" %in% dotnames) altargs$dnn < dots[["dnn"]]
## if ("deparse.level" %in% dotnames)
## altargs$deparse.level < dots[["deparse.level"]]
tableargs < list(rv, cv, dnn = c(rvlab, cvlab))
newargs < modifyList(tableargs, dots, keep.null = TRUE)
t1 < do.call("table", newargs)
rownames(t1)[is.na(rownames(t1))] < "NA" ## symbol to letters
colnames(t1)[is.na(colnames(t1))] < "NA"
if (rounded) t1 < round(t1, 1)
t2 < addmargins(t1, c(1,2))
t1colpct < round(100*prop.table(t1, 2), 1)
t1rowpct < round(100*prop.table(t1, 1), 1)
t1colpct < apply(t1colpct, c(1,2), function(x) gsub("NaN", "", x))
t1rowpct < apply(t1rowpct, c(1,2), function(x) gsub("NaN", "", x))
res < list("count" = t2, "colpct" = t1colpct, "rowpct" = t1rowpct,
call = match.call())
class(res) < "pctable"
print.pctable(res, colpct = colpct, rowpct = rowpct)
invisible(res)
}
print.pctable < function(x, colpct = TRUE, rowpct = FALSE, ...)
{
colpct < if (missing(colpct)) x$call[["colpct"]] else colpct
rowpct < if (missing(rowpct)) x$call[["rowpct"]] else rowpct
tab < summary(x, colpct = colpct, rowpct = rowpct)
print(tab, ...)
invisible(tab)
}
summary.pctable < function(object, ..., colpct = TRUE, rowpct = FALSE)
{
colpct < if (missing(colpct)) object$call[["colpct"]] else colpct
rowpct < if (missing(rowpct)) object$call[["rowpct"]] else rowpct
count < object[["count"]]
t3 < count
attr(t3, which = "colpct") < colpct
attr(t3, which = "rowpct") < rowpct
class(t3) < c("summary.pctable", "table")
if (colpct && !rowpct) {
cpct < object[["colpct"]]
for(j in rownames(cpct)){
for(k in colnames(cpct)){
t3[j, k] < paste0(count[j, k], "(", cpct[j, k], "%)")
}
}
return(t3)
}
## rowpct == TRUE else would have returned
rpct < object[["rowpct"]]
for(j in rownames(rpct)){
for(k in colnames(rpct)){
t3[j, k] < paste0(count[j, k], "(", rpct[j, k], "%)")
}
}
if (!colpct) {
return(t3)
} else {
cpct < object[["colpct"]]
t4 < array("", dim = c(1, 1) + c(2,1)*dim(object$colpct))
t4[seq(1, NROW(t4), 2), ] < t3
rownames(t4)[seq(1, NROW(t4), 2)] < rownames(t3)
rownames(t4)[is.na(rownames(t4))] < ""
colnames(t4) < colnames(t3)
for(j in rownames(object[["colpct"]])) {
for(k in colnames(object[["colpct"]])){
t4[1 + which(rownames(t4) == j) ,k] < paste0(object[["colpct"]][j, k], "%")
}
}
t4 < as.table(t4)
names(dimnames(t4)) < names(dimnames(count))
attr(t4, which = "colpct") < colpct
attr(t4, which = "rowpct") < rowpct
class(t4) < c("summary.pctable", "table")
return(t4)
}
}
print.summary.pctable < function(x, ...){
colpct < attr(x, "colpct")
rowpct < attr(x, "rowpct")
if (colpct && !rowpct) {
cat("Count (column %)\n")
} else if (!colpct && rowpct) {
cat("Count (row %)\n")
} else {
cat("Count (row %)\n")
cat("column %\n")
}
NextMethod(generic = "print", object = x, quote = FALSE, ...)
}
```
# Why does this work?
## pctable is designed with the R secret recipe in mind.
 pctable creates a set of data structures
 summary.pctable can create a summary view, which picks through the structures to make the desired information.
 print.summary.pctable displays the summary to the user
## Same sequence plays itself out across the R source code
 User runs a modelmaking function
```
m1 < lm(y1 ~ xf1 + x2, data = dat)
```
 R "knows what to do" when we pass m1 is used as an argument to other functions. We run
```
summary(m1)
```
or
```
plot(m1)
```
 Results come not from functions named summary, or plot, but rather from functions named summary.lm or plot.lm. Those special, classspecific functions are called methods
## Attributes
 m1 has an attribute named "class", which is set to the value of "lm".
 R runtime system depends on the class attribute.
 See all attributes
```
attributes(m1)
```
 Or just the one attribute we want to know
```
class(m1)
```
## R Functions set Attributes as class strings. {.allowframebreaks}
Check any of the major model fitting functions in the R source code.
 They build up a list of results (estimates, other recordkeeping elements).
 They then set attributes on that list
 Toward the end, as in the lm() function, one will find:
```
class(z) < c(if (is.matrix(y)) "mlm", "lm")
```
 If the model is multivariate, the class vector will be c("mlm", "lm"), otherwise just "lm".
 More specialized classes are appended on the front of the list.
 Something classed as c("mlm", "lm") can have specialized functions
for "mlm" objects where needed, but can use functions intended for
"lm" objects when suitable.
## The S3 system is very informal
 Can set class attribute to anything you want
 Unlike more structured languages like C++ or Java, there is no
typechecking to make sure this makes sense.
 No need to "register" the class with the runtime system in any more elaborate way.
 For existing R generic functions like summary, plot, print, table, and so forth, one need only create functions named summary.mynewclass, print.mynewclass, table.mynewclass, etc.
## Create some mischief, version 1.
 If you lie to the R runtime by claiming something is an instance of something else, then you generally get errors that seem somewhat opaque.
```r
m1 < c(1, 2, 3)
class(m1) < "lm"
> summary(m1)
```
```
Error: $ operator is invalid for atomic vectors
```
## Create some mischief, version 2 {.allowframebreaks}
 Here's a real lm regression object
```r
m1 < lm(y ~ xf1, data = dat)
```
 Add a new class on the front.
```r
class(m1) < c("pjgreatest", class(m1))
class(m1)
```
```
## [1] "pjgreatest" "lm"
```
 The "pjgreatest" class has no methods, so anything we run will "still work" because R looks at the class list. It finds no method "summary.pjgreatest", but it finds the Next Method "summary.lm". See?
```r
summary(m1)
```
```
##
## Call:
## lm(formula = y ~ xf1, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## 3.417 0.673 0.041 0.744 2.468
##
## Coefficients:
## Estimate Std. Error t value Pr(>t)
## (Intercept) 0.0624 0.1543 0.40 0.69
## xf1Denver 0.1720 0.2110 0.82 0.42
## xf1Santa Fe 0.1347 0.2206 0.61 0.54
##
## Residual standard error: 1.06 on 143 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple Rsquared: 0.0144, Adjusted Rsquared: 0.000645
## Fstatistic: 1.05 on 2 and 143 DF, pvalue: 0.354
```
 Now lets provide a new summary method
```r
summary.pjgreatest < function(object, ...){
newres < NextMethod(generic = "summary", object = object, ...)
newres$Rsquare < runif(1, min = 0.90, max = 0.999)
class(newres) < c("summary.pjgreatest", class(newres))
newres
}
```
 The strong (strong!) recommendation from the R developers is to
write summary methods that return data structures, and then we write a
print method to display that structure.
```r
print.summary.pjgreatest < function(x, ...){
cat("You are now in the twilight zone. \n")
cat("All regressions fitted here are nearly perfect. \n")
cat(paste("The Rsquare is", x$Rsquare, "\n\n"))
cat(paste("That's a great Rsquare, just like you deserve! \n\n"))
cat("But if you want to believe those R guys, the R square is only", x$r.square, "\n\n")
cat("See?\n")
NextMethod(generic = "print", object = x, ...)
}
```
```r
summary(m1)
```
```
## You are now in the twilight zone.
## All regressions fitted here are nearly perfect.
## The Rsquare is 0.997020865603583
##
## That's a great Rsquare, just like you deserve!
##
## But if you want to believe those R guys, the R square is only 0.01443
##
## See?
##
## Call:
## lm(formula = y ~ xf1, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## 3.417 0.673 0.041 0.744 2.468
##
## Coefficients:
## Estimate Std. Error t value Pr(>t)
## (Intercept) 0.0624 0.1543 0.40 0.69
## xf1Denver 0.1720 0.2110 0.82 0.42
## xf1Santa Fe 0.1347 0.2206 0.61 0.54
##
## Residual standard error: 1.06 on 143 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple Rsquared: 0.0144, Adjusted Rsquared: 0.000645
## Fstatistic: 1.05 on 2 and 143 DF, pvalue: 0.354
```
## Generic Functions
Definition: a function that receives a request and sends it to another function
Generic usage
```
summary(m1)
```
calls a method (an R function) named summary.lm
 "Dispatching" in S3. R runtime looks at the class of the FIRST
input argument, and then sends the work to the other function.
 I learned the importance of this when working on pctable. It's possible to
 write a formula method pctable.formula, so users can run pctable(y ~ x, data = dat)
 write a character method pctable.character, so users can run pctable("y", "x", data = dat)
 but it is not possible to write a method in which the first argument is nothing, such as pctable(y, x, dat) because y does not exist and the S3 dispatcher becomes hopelessy confused.
## Generic functions almost always are simple with a minimal declaration
 "..." is a legal word. Meaning "user can add named arguments and if they don't match the ones we expect, we'll toss all of them into "...". Comments about processing "..." included in new Rchaeology.
 The generic print
```
print < function(x, ...) UseMethod("print")
```
 This preserves flexibility, puts weight of work onto methods, which
 can have more named arguments.
 must have arguments x and ...
## Generic functions
 Examples of commonly used generics
 summary, print, plot
 Methods examples
 summary.lm (creates new object with class "summary.lm")
 summary.glm
 summary.table
 R has tools to see methods that are available
 list available print methods
```
methods(print)
```
 list methods that can apply to objects of class "lm"
```
methods(class = "lm")
```
## Methods
 Method is the name for those "other functions" that actually do the work.
 Named in format "generic.class".
 If the class attribute of x is c("A","B"), then print(x) looks for
 ``` print.A ``` Not finding that, it looks for ``` print.B ```
 Not finding that, it looks for ``` print.default ```
 summary.class should always create a new object, the class of
 which is summary.class and the print method for that
 would be named print.summary.class
## print methods
```
print.table < function (x, digits = getOption("digits"),
quote = FALSE, na.print = "",
zero.print = "0", justify = "none", ...)
```
```
print.lm < function (x, digits = max(3L, getOption("digits")  3L), ...)
```
```
print.default < function(x, digits = NULL, quote = TRUE, na.print = NULL,
print.gap = NULL, right = FALSE, max = NULL,
useSource = TRUE, ...)
```
 print methods usually have to be separately created for the class and all of its derivative subclasses
 print.lm
 print.summary.lm
## ... is a three letter word
 Literally, the period is a legal character, three periods is a word!
 There's not much manual info on "...". I have found the only
solution was a careful study of the R source code. See how they do
it, then do it like they do.
 Explanation inserted into new Rchaeology vignette. Will get better
 Inside a function, we almost always either grab and inspect the dots
```
dots < list(...)
dotnames < names(dots)
```
 Often, the dots have argument that are intended for a particular
purpose. We often have to separate the arguments, by name
(dotnames).
## S3/S4
 S4 is a more elaborate object structure, more formal, less prone to mischief of the sort we just explored.
 Seems that many packges, and almost all of base R, are still written in S3.
 The fact that the 2 sorts coexist in the same runtime causes a lot of trouble for me.
 Different accessors.
 S3 uses dollar signs, dat$x same as dat[ , "x"]
 S4 uses @ as accessor.
 So function receives an object, we have to write specialized code to figure out if the thing is S3 or S4 and then use $ and @ properly.
# Building an S3 class around an Mplus Automation result
## If you've never done this, you should try it.
 Let your GRAs write functions however they want to, without enforcing any naming scheme or style. Let them accumulate a few 1000 lines.
 Then tell them that you want to hide all of that detail in a package and expose to the users only a simple set of standardlooking functions.
```
m1 < fitter(y ~ x1 + x2, data = dat)
summary(m1)
coef(m1) ## method displays brief
coef(summary(m1)) ## makes parameter table
anova(m1)
plot(m1)
```
 I say you should try that because it is fun to watch the complaining
## We began with an interace with idiosyncratic names like
```
runEverything < function(dFrame = NULL, formString = NULL,
missValue = 999, patternV = patternVector,
mScript = mplusSkel, mplusFile = 'mplus.csv',
isMissing = FALSE, numPerLine = 5, isWeighted = FALSE,
mplusDir = "../../NLTS2data/Mplus/", subPop = FALSE)
summarizeObj < function(List = NULL, r.VGAM = FALSE, dFrame = NULL)
```
## Here's what we end up with {.allowframebreaks}
```r
stages < function(formula = NULL, data = NULL,
missValue = 999, skel = mplusSkel,
mplusFile = 'mplus.csv',
isMissing = FALSE, numPerLine = 5, isWeighted = FALSE,
mplusDir = "../../NLTS2data/Mplus/", subPop = FALSE)
{
require(MplusAutomation)
if (!file.exists(mplusDir)) dir.create(mplusDir, showWarnings = FALSE, recursive = TRUE)
if (!plyr::is.formula(formula)) formula < as.formula(formula)
## Create the matrix and recoding for Mplus
mMat < makeMplusMat(formula, data, missValue)
## Make the number of stages
nstages < length( unique( mMat[mMat[, 1] != 999, 1] ) )
## Check if nstages is 2, 3, or 4
if(nstages < 2  nstages > 4){
stop('The number of stages must be 2, 3, or 4')
}
## Add in code about 2 stages aka logistic regression
if(nstages > 2){
mMat < recodeMplus(mMat, nstages, missValue)
}
## Makes the replacement vector along with the mplus Key for conversion later returns list
rList < makeReplacementVector(mplusFile = mplusFile,
data = mMat,
skel = skel,
isMissing = isMissing,
numPerLine = numPerLine,
isWeighted = isWeighted,
nstages = nstages,
missValue = as.character(missValue),
subPop = subPop)
mplusKey < rList[[2]]
outList < runMplus(data = mMat, replaceV = rList[[1]], skel, mplusDir, mplusFile)
## more direct way to rename all values in param column, misses Thresholds, however
pjkey < mplusKey[ , "predictors"]
names(pjkey) < mplusKey[ , "mplus"]
## Now beautify names of variables and columns in parameter objects
newheaders < c(pval = "Pr(>z)", est = "Estimate", se = "Std Error", est_se = "z value")
for(aname in names(outList$parameters)){
param < outList[["parameters"]][[aname]][["param"]]
outList[["parameters"]][[aname]][["param"]] <
ifelse( param %in% names(pjkey), pjkey[param], param)
## PJ. need to catch thresholds named V$x. make second pass
param < sapply(strsplit(param, "\\$"), function(xxx) xxx[1])
outList[["parameters"]][[aname]][["param"]] <
ifelse( param %in% names(pjkey), pjkey[param], param)
paramHeader < outList[["parameters"]][[aname]][["paramHeader"]]
phlist < strsplit(paramHeader, "\\.")
for(i in seq_along(phlist)){ phlist[[i]][1] < ifelse(phlist[[i]][1] %in% names(pjkey),
pjkey[phlist[[i]][1]],
phlist[[i]][1])
}
outList[["parameters"]][[aname]][["paramHeader"]] <
lapply(phlist, function(xx) paste0(xx, collapse = "."))
oldcols < colnames(outList[["parameters"]][[aname]])
colnames(outList[["parameters"]][[aname]]) < ifelse(oldcols %in% names(newheaders), newheaders[oldcols], oldcols)
}
res < list(output = outList,
mplusKey = mplusKey,
formula = formula,
mplusFile = paste0(mplusDir, "mplus.out"))
class(res) < "stages"
invisible(res)
}
summary.stages < function(object = NULL, type = "unstandardized", trimrows = TRUE, ...)
{
## Check for type as character
if(is.character(type)){
type < match.arg(tolower(type),
c('unstandardized', 'std.standardized',
'stdy.standardized', 'stdyx.standardized'))
} else {
stop(paste("type must equal unstandardized,",
"stdyx.standardized, stdy.standardized,",
"or std.standardized."))
}
formula < object$formula
coefs < object[["output"]][["parameters"]][[type]]
noise < grep("Means$Variances$.*WITH$", coefs$paramHeader)
paramHeader < coefs$paramHeader
if (trimrows & length(noise) > 0){
coefs < coefs[noise, ]
}
## only if trimrows can we tighten up predictor names
if (trimrows){
paramHeader < substr(coefs$paramHeader, 1, 2)
paramHeader < gsub("C", "", paramHeader)
}
## Make rownames
rownames(coefs) < paste0(coefs$param, ";", paramHeader)
## JH 2/4/2015
## create Th;1, ...., Th;n
charVec < paste0('^', rownames(object$mplusKey)[grep('^C\\d{1,}_', rownames(object$mplusKey))], ';.+')
newThVec < paste0('Threshold;', 1:length(charVec))
## Final rownames
rownames(coefs) < multiGsub(charVec, newThVec, rownames(coefs))
coefs$param < NULL
coefs$paramHeader < NULL
coefs < as.matrix(coefs)
res < list(coefficients = coefs, N = object[[1]]$summaries$Observations,
LL = object[[1]]$summaries$LL ,
AIC = object[[1]]$summaries$AIC,
BIC = object[[1]]$summaries$BIC,
aBIC = object[[1]]$summaries$aBIC,
AICC = object[[1]]$summaries$AICC, formula = formula)
class(res) < "summary.stages"
attr(res, "trimrows") < trimrows
attr(res, "type") < type
res
}
print.summary.stages < function(x,
digits = max(3L, getOption("digits")  3L),
signif.stars = getOption("show.signif.stars"), ....)
{
coefs < x$coefficients
cat(paste("This is", attr(x, "type") ,
"output from Mplus, reformatted similar to VGAM", "\n"))
if (attr(x, "trimrows")) cat(paste("Note: Omitting output rows for WITH,",
"Variances, Means", "\n"))
printCoefmat(coefs, digits = digits, signif.stars = signif.stars,
na.print = "NA")
cat(paste("N:" , formatC(x$N, digits = digits), "\n"))
cat(paste("LL:" , formatC(x$LL, digits = digits), "\n"))
cat(paste("AIC:" , formatC(x$AIC, digits = digits), "\n"))
cat(paste("BIC:" , formatC(x$BIC, digits = digits), "\n"))
cat(paste("aBIC:" , formatC(x$aBIC, digits = digits), "\n"))
cat(paste("AICC:" , formatC(x$AICC, digits = digits), "\n"))
invisible(coefs)
}
nobs.stages < function(object, ...){
object[[1]]$summaries$Observations
}
logLik.stages < function(object, ...){
myll < object[[1]]$summaries$LL
attr(myll, "df") < object[[1]]$summaries$Parameters
myll
}
coef.summary.stages < function(object, ...){
ggg < object$coefficients
}
coef.stages < function(object, ...){
objsum < summary(object)$coefficients
}
```
## Key elements
 Review the output from lm and glm, and output from summary.lm and summary.glm
 New class stages should use IDENTICAL names for output elements where possible
 Note that R summary methods for regressions have a standard format
for tables, and the table is accessed by coef(summary(m1))
 Methods using common R names need to be provided:
 nobs
 coef
 coef.summary
 logLik
 Needs anova method
 This "almost" (99.7%) is compatible with outreg regression table generator
 Need next: Roxygen markup to document functions
## Advice about designing interfaces
 new Rchaeology vignette
 tidbits
1. Protect your function's calculations from the user's workspace.
2. Check argument values.
3. Design the function so that it runs with a minimum number of arguments.
## Now I emphasize 3, the minimum number of arguments
 In the development process, we are often unsure which arguments should be passed around.
 Often we end up with 3 or 4 related things, "a1", "b1", "c1", "d1" and function declarations have all four of those things passed along.
 If those 4 things really are linked, they are always companions, we should more formally link them together. Can
 Put all 4 things into a list, or
 Create a1 and then add b1 c1 d1 as attributes:
```
attr(a1, "b") < b1
attr(a1, "c") < c1
attr(a1, "d") < d1
```
This eliminates the need to write myFunction(a, b, c, d) to receive 4 arguments, we instead write
```
myFunction < function (a){
b < attr(a, "b")
c < attr(a, "c")
d < attr(d, "d")
[blah blah blah]
}
```