NBA analysis using R and JavaScript (Shiny) Part 2

The previous article outlined the basics of the Shiny software package, which enables the creation of a reactive user environment in the R programming language. This article focuses on integrating the Shiny application into the wider context of web technologies (JS and CSS), as well as on the R / Shiny features that enable reactive programming and principal component analysis (PCA). Much of the article is devoted to interpreting the PCA results in light of the goals previously set.

Additional JavaScript and CSS

One of the basic ideas of our application is to combine JavaScript libraries with Shiny functions and to further customize the appearance of the user environment. In addition to the excellent graphing libraries in R, JavaScript offers animated and interactive graphical components that can highlight the results of the analysis more effectively. In this application, we will use Chart.js. Any JS file can be included in a Shiny application using tag functions that mirror ordinary HTML, placed within tags$head (server.R):

tags$head(
  tags$link(rel = "stylesheet", type = "text/css", href = "bootstrap.css"),
  tags$script(src = "https://code.jquery.com/jquery-3.2.1.js"),
  tags$script(src = "chartUI.js")
),

The relative paths to the .css and .js files follow the Shiny convention that all “web” files are placed in the www directory.
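As a rough sketch of how the head block above might sit inside a complete UI definition: the fluidPage wrapper, the canvas element, and its id scatterChart are assumptions made for illustration, since the original layout is not shown.

```r
library(shiny)

# Hypothetical UI; only the tags$head part mirrors the article's snippet.
ui <- fluidPage(
  tags$head(
    # Relative paths are served from the app's www directory
    tags$link(rel = "stylesheet", type = "text/css", href = "bootstrap.css"),
    tags$script(src = "chartUI.js")
  ),
  # Assumed placeholder that a Chart.js script could draw into
  tags$canvas(id = "scatterChart")
)
```

With this layout, the app directory would hold the R files next to a www folder containing bootstrap.css and chartUI.js.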

The look of the application is controlled through .css files, in particular a bootstrap.css file, which may or may not be present. A convenient way to restyle the entire application is to use one of the ready-made Bootstrap themes provided by Bootswatch: select a theme, download its bootstrap.css, and save it to the www directory.

The JavaScript files in the www directory (chartUI.js, scatterChartConfig.js, and radarChartConfig.js) contain the functions responsible for plotting the graphs and for receiving data from R / Shiny.

R functions, reactive programming, and Shiny-JS communication

When the application is launched, a .csv file with all the data used in the analysis is loaded automatically. The PCA is likewise performed immediately, and users are presented with the results in two graphical representations: a scatter plot and a principal component contribution graph.

Additional R functions, especially those that contain analytical logic, can live in separate R files (here engine.R) and be included in server.R with the source() function, given the file path.
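A minimal sketch of the source() mechanism, runnable outside Shiny. The file is written to a temporary directory only to keep the example self-contained, and the body given to extractNum here is a guess for illustration; the original implementation is not shown in the article.

```r
# Write a stand-in engine.R (hypothetical contents) to a temporary directory
enginePath <- file.path(tempdir(), "engine.R")
writeLines(
  "extractNum <- function(x) as.numeric(gsub('[^0-9]', '', x))",
  enginePath
)

# In server.R this would typically be source("engine.R")
source(enginePath)

# Functions defined in the sourced file are now available
print(extractNum("PC2"))
```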

Users are given two kinds of control over the results: two drop-down menus that select the principal components (PCA axes) displayed in the scatter and contribution graphs, and a choice of statistic type, team or opponent. For the display to update automatically in response to user actions, the application has to use the reactive components and features that Shiny offers. These can be divided into two groups:

1. Implicit: all the variables that are generated automatically for each active element of the user environment, such as buttons, drop-down menus, or text boxes. Shiny uses a simple convention here, binding the element name specified by the developer to the global input variable. For example, typing in a text field defined as textInput("title", …) updates the variable input$title, whose value becomes the text entered by the user.
2. Explicit: those declared manually in the code with reactive({}). They wrap standard R functions so that these are re-executed, without an explicit call, whenever any reactive variable used inside them changes. These functions must also be explicitly “observed” using functions such as observe() or observeEvent().
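The two groups can be illustrated with a small, self-contained sketch. The names shoutedTitle and title_out are invented for this example and are not part of the NBA application; only the textInput id "title" is borrowed from the text above.

```r
library(shiny)

ui <- fluidPage(
  textInput("title", "Title"),   # implicit: exposed to the server as input$title
  textOutput("title_out")
)

server <- function(input, output, session) {
  # Explicit reactive: re-evaluated whenever input$title changes
  shoutedTitle <- reactive({
    req(input$title)             # wait until a non-empty value is available
    toupper(input$title)
  })

  # Observer: side effect that runs each time shoutedTitle() yields a new value
  observeEvent(shoutedTitle(), {
    message("Title is now: ", shoutedTitle())
  })

  output$title_out <- renderText(shoutedTitle())
}

# shinyApp(ui, server)  # uncomment to run the app interactively
```

Typing into the text box changes input$title, which invalidates shoutedTitle, which in turn triggers both the observer and the text output, without any of them being called by hand.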

A combination of reactive variables can thus form a tree of reactive functions that are invoked in cascade whenever any of the variables changes. In our application, the basic reactive part is:

pcaList <- reactive({
  # Re-run the PCA whenever a selected component or the statistic type changes
  performPCAwrap(
    getInternalData(),
    data.frame(fpc = extractNum(input$first_pc),
               spc = extractNum(input$second_pc)),
    input$stats)
})

# Push the fresh results to the JavaScript handlers in chartUI.js
observeEvent(pcaList(), {
  session$sendCustomMessage(type = "jsondata", toJSON(pcaList()$scores))
  session$sendCustomMessage(type = "jsoninfo", toJSON(pcaList()$pcainfo))
})

The reactive pcaList function is executed when the application starts and whenever one of the implicit variables input$first_pc, input$second_pc, or input$stats changes; these come from the drop-down menus and the radio buttons. performPCAwrap is the function that drives the PCA and returns its results. It takes three parameters: the input data returned by getInternalData(), an R data.frame that carries the selected principal components, and the statistic type. pcaList is monitored through observeEvent, so any change in its inputs triggers the session functions, which in turn send the results as JSON to the JavaScript functions in chartUI.js.
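To give an idea of what the JavaScript side might receive, here is a small sketch of serializing a score table to JSON. It assumes that toJSON comes from the jsonlite package, which the article does not state, and it uses made-up scores for two teams rather than the real data.

```r
library(jsonlite)

# Made-up scores for two teams on two components; not the article's data
scores <- data.frame(
  team = c("BOS", "LAL"),
  PC1 = c(1.5, -0.5),
  PC2 = c(0.25, 2)
)

# By default jsonlite turns a data.frame into an array with one object per row
json <- toJSON(scores)
print(json)
```

One object per row is a convenient shape for a scatter chart, because each team arrives with its label and its coordinates together.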

On the JavaScript side, data sent through the session can be read via the global Shiny object and its addCustomMessageHandler method, which makes the data available for further processing and display.

Shiny.addCustomMessageHandler("jsondata",
     function (jsondata) {
         // further processing of the data and passing it on to the charts
     }
);

Principal components analysis

The engine.R file contains functions that load the data (getInternalData()) and perform the analysis itself (performPCA()). R offers a number of functions for PCA; we will use prcomp(), which is part of the stats package included in every standard R installation. The most important part of the code is in the performPCA function:

performPCA <- function(ogd, pcaChoice) {
   pcaTeam <- prcomp(ogd, scale. = TRUE)
   teamOrder <- match(team, rownames(pcaTeam$x))
   pcaScores <- pcaTeam$x[teamOrder, ]
   pcaLoadings <- pcaTeam$rotation
   pcaVarianceExplained <- round(summary(pcaTeam)$importance[2, ], 2)
   pcaCorrelations <- as.data.frame(cor(ogd[teamOrder, ], pcaScores))
   …

The prcomp() function receives the data as an R data.frame and accepts a scale. parameter. In our analysis, scaling the input data beforehand is necessary to cancel out the effect that variables of different kinds (with different variances and absolute values) would otherwise have on the overall variability of the data. After the analysis, the pcaTeam variable contains a named list of results, which are then separated into individual variables to prepare them for conversion to JSON and for sending to the graphs in the JavaScript functions.

pcaScores holds a table with 30 rows (one per team) and 17 columns, giving each team's score on each of the 17 principal components. This data is plotted on the scatter plot. The proportion of the data variability captured by each principal component is stored in pcaVarianceExplained. pcaCorrelations contains the correlation coefficients between the original data and the data expressed in principal components (pcaScores), in a data.frame with 17 rows and 17 columns. These coefficients range between -1 and 1, where 0 indicates that the variable has no influence on the team's score on that principal component. Larger absolute values are interpreted as a stronger influence, and variables with coefficients close to 1 or -1 are the most influential.
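The same quantities can be explored on a small scale with a dataset that ships with R. The sketch below uses USArrests (50 rows, 4 variables) as a stand-in for the team statistics, which are not reproduced here; the variable names are chosen for this example only.

```r
# USArrests is a built-in dataset; it stands in for the NBA team statistics
pca <- prcomp(USArrests, scale. = TRUE)

scores <- pca$x                                              # one row per observation
varianceExplained <- round(summary(pca)$importance[2, ], 2)  # proportion per component
correlations <- cor(USArrests, scores)                       # variables x components

print(varianceExplained)
print(round(correlations[, 1:2], 2))

# Without scaling, the variable with the largest variance dominates PC1
unscaled <- prcomp(USArrests, scale. = FALSE)
print(round(summary(unscaled)$importance[2, 1], 2))
```

Comparing the first component's share of variance with and without scale. = TRUE shows why scaling matters when variables are measured on very different ranges: in the unscaled run, the variable with the largest raw variance absorbs most of the variability.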
