





The effects of task complexity and skill on over/under-estimation of internal control






Maureen Francis Mascha
Marquette University, Milwaukee, Wisconsin, USA, and

Cathleen L. Miller
Wayne State University, Detroit, Michigan, USA

Received 2 November 2009; Reviewed 9 March 2010; Accepted 22 May 2010

Abstract

Purpose – Using Bonner's model, this paper aims to examine the effects of skill level and the two task complexity dimensions – clarity and quantity of information – on subjects' internal control risk assessments.

Design/methodology/approach – The research expands the literature by isolating the individual components of task complexity and examining how skill interacts with either component in affecting decision making. It uses a 2 × 4 mixed factors laboratory experiment. This design allows the effects of task complexity and skill between and within subjects to be examined. The mixed factors design includes two levels (high and low) for each dimension of task complexity (clarity and quantity) on a between-subjects basis, with four separate cases on a within-subjects basis. Skill level is measured as the subject's task-related knowledge. The subjects are senior-level students in auditing courses at three Midwest universities.

Findings – It is found that subjects assess control risk too high, consistent with the conservatism principle. Skill level mediates this finding: high-skill subjects make more accurate risk assessments; low-skill subjects consistently assess control risk too high. Over repetitions of complex tasks, high-skill subjects make more accurate assessments, while low-skill subjects initially overstate, then improve. For repetitive simple tasks, both skill levels get worse, increasingly overstating their assessments. These findings support current practice, indicating that experienced auditors make complex risk assessments, where repetitive performance of complex tasks improves risk assessments. However, repetitive simple tasks may result in assessing control risk too high, resulting in excessive testing.

Originality/value – Consistent with prior research, the results suggest task complexity and subject skill are important considerations in experimental research designs.

Keywords: Skills, Decision making, Task analysis, Auditing
Paper type: Research paper



Managerial Auditing Journal, Vol. 25 No. 8, 2010, pp. 734-755. © Emerald Group Publishing Limited, 0268-6902. DOI 10.1108/02686901011069533



I. Introduction

This study examines factors that affect the auditor's risk assessment process, a process the Public Company Accounting Oversight Board (PCAOB) has identified as needing improvement. In its annual inspection reports to Congress[1], the PCAOB has expressed concern that auditors do not adjust audit procedures or do not respond appropriately to risks identified in the audit. It states that:

[. . .] PCAOB inspection teams have, in some cases, observed that auditors failed to expand audit procedures when addressing identified fraud risk factors. In those cases, it appeared that auditors might be performing the procedures required in AU § 316 mechanically, without using those procedures to develop insights on the risk [. . .] PCAOB inspection teams have observed instances of auditors failing to respond appropriately to identified fraud risk factors [. . .] (Excerpts from PCAOB, 2007a, Inspections, Release No. 2007-001, January 22).



These excerpts indicate that merely identifying risk factors is not sufficient; auditors must demonstrate that they understand the implications of these factors and document the results of tests performed to detect whether those implications ultimately affect the financial statements (PCAOB, 2007b, Standards, AS5).

Auditing Standard No. 5 (AS5) highlights the importance of internal control risk assessment in auditing public companies' financial statements (PCAOB, 2007b, Standards, AS5). According to this standard, auditors must first thoroughly understand the entity's internal control environment in order to identify internal control risks and deficiencies. Second, auditors must assess whether the identified risks and/or deficiencies are likely to occur and the extent to which they affect the internal controls over financial reporting. Third, auditors must design appropriate tests to determine whether the internal controls exist and work as identified (PCAOB, 2007b, Standards, AS5). According to the PCAOB review reports, auditors appear able to identify the risks and/or deficiencies but struggle with the last two steps of this risk assessment process (PCAOB, 2007a, Inspections, Release No. 2007-001, January 22).

An auditor's internal control risk assessment can result in over- or understated assessments of risk, both of which have negative consequences for the auditor. Overstatement error results in over-testing and potentially non-compensated audit work; understatement error can result in audit failure because of over-reliance on internal controls, leading to reduced testing and ultimately resulting in failure to detect material misstatements (Bierstaker and Brody, 2001). Even if an understatement error does not lead to outright audit failure, it can lead to future restatements of financial statements (Wright and Wright, 1996).

Since over- and understated assessments of control risk can lead to costly outcomes, audit firms spend considerable time and money to improve auditors' assessments of internal controls. Therefore, understanding potential causes of the over/understatement is important from a practice perspective. Educators wishing to improve the education of the next generation of auditors can also benefit from a better understanding of the factors associated with risk assessment.

This study examines two factors affecting the risk assessment process: task complexity and the auditor's task-related skill level. Prior research suggests that task complexity can affect auditor judgments and accuracy (Anderson, 1983; Abdolmohammadi and Wright, 1987; Bonner and Lewis, 1990; Bonner, 1994). This study uses Bonner's (1994) model to examine the effects of the two dimensions of task complexity – clarity of information and volume (or quantity) of information – and the interactive effects of task complexity and the auditor's task-related skill level on auditors' assessments of internal control risk. We perform a mixed factors laboratory experiment with task complexity varied between subjects and assessment of control risk measured as a within-subjects variable. Task-related skill is measured using the subject's first control risk assessment for assignment to high- and low-skill groups. Since we are testing theory, we use senior-level accounting students enrolled in the undergraduate auditing course at three different universities. Previous research considers senior-level students appropriate surrogates for novice auditors (Ashton and Kramer, 1980).












Generally, we find support for Bonner's model: task complexity negatively affects judgment performance. Although both clarity of information and quantity of information are important in determining task complexity, clarity of information appears to dominate the negative effects on judgment performance. Skill level is also important in interpreting the experimental results, mediating judgment performance. Both high- and low-skill subjects improve (impair) their risk assessment judgments in complex (simple) conditions the more they perform such risk assessments. Apparently, when faced with several simple risk assessment judgments, auditors revert to conservatism, assessing control risk too high.

The remainder of the paper is organized as follows. Part II discusses the literature and develops the hypotheses. Part III describes the experimental design and methodology, and Part IV presents the findings, implications, and directions for future research.

II. Literature and hypotheses

Bonner's model

Bonner (1994) extensively reviewed the decision-making literature to isolate reasons for its disparate, often contradictory findings. Her model characterizes the judgment and decision-making process as three phases – input, processing, and output. The decision maker encounters the information needed to make the decision in the input phase, performs the cognitive work necessary for making a decision in the processing phase, and communicates the decision in the output phase. Since the three phases link serially, information introduced during the preceding phase(s) affects the following phase(s). The model applies in each phase and states that judgment performance is a function of task complexity, skill, and motivation, depicted as follows:

\[ \text{Judgment performance} = -f\!\left(\frac{\text{task complexity}}{\text{skill, motivation}}\right) \]

Judgment performance is the person's decision in a judgment task, where these decisions are considered "better or worse." Task complexity is composed of two dimensions: clarity of information and volume (hereafter referred to as "quantity") of information. The level of task complexity is a combination of these two dimensions; however, the model does not identify the individual dimensions' effects on judgment performance. Skill is defined in two ways: (1) a person's level of task-specific knowledge; or (2) a person's experience with the task, measured over time (e.g. years of audit experience). Finally, motivation is the level of the person's interest in performing the task.

The model shows that task complexity has a direct, inverse relationship with judgment performance, such that as tasks become more complex, judgment performance worsens. However, the decision maker's skill level and motivation mediate this relationship by lessening the negative effect of task complexity on judgment performance. Essentially, the decline in judgment performance is less for higher skilled (more motivated) decision makers than for lower skilled (less motivated) decision makers[2].
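To make the functional form concrete, the toy sketch below is our own illustration, not Bonner's calibrated model: the linear form of f and the multiplicative combination of skill and motivation are assumptions made purely to show the qualitative pattern the model asserts (performance declines with complexity, less steeply at higher skill).

```python
# Toy illustration (ours) of the model's qualitative claim: judgment
# performance falls as task complexity rises, and the decline is flatter
# when skill and motivation are higher.

def judgment_performance(complexity: float, skill: float, motivation: float) -> float:
    """Higher (less negative) values mean better judgment; all inputs positive."""
    return -complexity / (skill * motivation)

for skill in (1.0, 2.0):  # a lower- vs a higher-skill decision maker
    curve = [judgment_performance(c, skill, motivation=1.0) for c in (1, 2, 3)]
    print(f"skill={skill}: {curve}")
# skill=1.0: [-1.0, -2.0, -3.0]  -> steep decline as complexity rises
# skill=2.0: [-0.5, -1.0, -1.5]  -> mediated (flatter) decline
```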



Auditor judgment and task complexity

Bonner's (1994) model expresses judgment performance in terms of "better or worse"; however, of more concern in auditing is whether auditor judgments are over- or understated, since the direction of the error results in different consequences. One auditor judgment particularly sensitive to over- or understatement, with significantly different consequences, is risk assessment. Since risk assessment primarily occurs at the beginning of an audit (Ricchiute, 2006), we use Bonner's task complexity definitions for the input phase of the decision-making process.

Task clarity is the degree to which information cues are consistent with each other and with information stored in memory. When cues are consistent (inconsistent), tasks are clear (unclear). Quantity of information is the number of alternatives, number of cues, or number of procedures to process in the task. When the number of cues is low (high), the task is simple (complex). Therefore, the simplest (most complex) task exists when clarity is high (low) and quantity is low (high).

Hypotheses

Bonner's (1994) model predicts that complex tasks result in poor judgment performance. In auditing, poor judgment performance results when either over- or understatement of risk assessment occurs. Since Bonner's model does not directly predict over- or understatement judgments, we turn to cognitive load theory and the conservatism principle (Watts, 2003a, b) to derive directional hypotheses.

Cognitive load theory asserts that people resort to heuristics or biases when working memory is challenged and information is not properly processed (Rose and Wolfe, 2000; Sweller and Chandler, 1991). Thus, when tasks are complex, auditors cannot properly process the information, so they are more likely to fall back on their conservative training. Conservatism predicts that, when given a choice, auditors prefer to perform more testing rather than less, since the costs associated with audit failure are so great (Chaney, 2002). Therefore, conservatism indicates auditors are more likely to assess internal control risk too high rather than too low in order to reduce the costs of errors. Furthermore, since complex tasks often involve a large degree of judgment (e.g. auditing contingencies), auditors are more likely to use conservatism when performing complex tasks, even when good internal controls may exist (Wright and Wright, 1996).

In sum, for complex tasks, Bonner's (1994) model predicts auditor judgment declines, and cognitive load theory predicts auditors resort to conservatism, resulting in overstated internal control risk assessments. Therefore, this study examines the following hypothesis, stated in the alternate form, for complex tasks:

H1a. Auditors performing a complex task will overstate their judgments of risk assessment.

Bonner's (1994) model predicts that simple tasks result in better judgment performance. Simple tasks allow for better processing of information and, therefore, should result in more accurate risk assessment judgments, i.e. no significant over- or understatement of risk assessment. However, simple tasks may result in understatements of risk assessment. When tasks are low in complexity (e.g. "simple"), the amount of effort required and the difficulty of assessing risk are less than for complex tasks, irrespective of how "good" or "bad" internal controls are in these audit areas. Therefore, auditors performing simple judgment tasks are more likely to understate their judgments of risk assessment because of lack of proper attention or the onset of boredom, especially if the auditor becomes too confident in his knowledge (Mascha and Smedley, 2007). Since theory does not support a specific direction for auditors' risk assessment judgments for simple tasks, we test the following null hypothesis. The dashed red line in Figure 1 shows the predictions for H1a and H1b:

H1b. Auditors performing a simple task will not understate or overstate their judgments of risk assessment.

Bonner's (1994) model does not address the individual dimensions of task complexity – clarity and quantity of information – and their effects on judgment performance. Tasks are seldom totally complex (low clarity and high quantity) or totally simple (high clarity and low quantity), so we examine the two dimensions (clarity and quantity) separately, extending Bonner's model.

Cognitive load research finds that task complexity affects judgment performance through its effect on working memory (Sweller and Chandler, 1991) and information processing (Anderson, 1983). Since working memory is notoriously limited in capacity and duration (Sweller and Chandler, 1991; Rose and Wolfe, 2000; Anderson, 1983), taxing either one causes detrimental effects on information processing, resulting in less decision accuracy (Rose and Wolfe, 2000).

Using consistency of information cues as the definition of clarity in task complexity (Bonner, 1994), tasks with more inconsistent cues – cues that contradict one another, cues that do not agree with cues stored in memory, cues that are irrelevant, etc. – are more complex. When processing these inconsistent cues, working memory becomes challenged and finally overwhelmed. At this point, the individual resorts to learned heuristics to relieve the stress on working memory (Rose and Wolfe, 2000). Most auditors are likely to use conservatism as their heuristic (Watts, 2003a, b), assessing control risk too high (i.e. overstated risk assessment) to prevent potentially significant audit losses (Chaney, 2002).

Similarly, research supports the premise that as the amount of information increases, working memory becomes overloaded, inducing fatigue and causing an increase in errors (Rose and Wolfe, 2000; Iselin, 1988).



[Figure 1. H1a and H1b – hypothesized and actual results. Risk assessment (understated (−) to overstated (+)) is plotted against task level (complex to simple), with hypothesized and actual lines labeled H1a and H1b.]



As information quantity increases, the judgment process requires additional processing, resulting in loss of working memory and poorer judgment performance (Rose and Wolfe, 2000). In sum, tasks low in clarity or high in quantity increase cognitive load through their increased demands on working memory and processing; thus, not all pertinent information is properly considered, causing decision accuracy to suffer. Additionally, the conservatism principle predicts auditors are more likely to emphasize "bad" versus "good" information, particularly if the auditor is uncertain about the task or risk (Basu, 1997). Therefore, auditors performing tasks that are low in clarity or high in quantity of information are more likely to focus on poor internal controls instead of those operating effectively and are more likely to make overstated risk assessment judgments. Therefore, this study examines the following hypotheses, stated in the alternate form:

H2a. Auditors performing tasks low in clarity will overstate their judgments of risk assessment.

H2b. Auditors performing tasks high in quantity (i.e. large amounts of information) will overstate their judgments of risk assessment.

Conversely, when information cues are consistent with each other and/or with memory (i.e. high clarity) or only the relevant amount of information is provided (i.e. low quantity), working memory is not heavily taxed, allowing for better processing of all available information (Sweller and Chandler, 1991). Since information is processed more effortlessly, with less demand on working memory, auditor risk assessments should be more accurate (Bonner, 1994). On the other hand, if the information is too easily processed – such as when the information cues are very consistent with each other, potentially forming a consistent pattern, or the quantity of information is no more than necessary – then auditors may ignore cues and information they would otherwise consider, resulting in less decision accuracy. Unlike when the task is unclear or the quantity of information is more than needed, the auditor is less likely to engage the conservatism principle, since the task is not uncertain (Basu, 1997). Therefore, the auditor is less likely to overstate risk assessment judgments. However, because the task is higher in clarity or lower in quantity of information, auditors are more likely to be confident in their risk assessment judgments. Since they are more confident, auditors are more likely to understate their risk assessment judgments (Mascha and Smedley, 2007; Mascha, 2001). Based on this research, this study examines the following hypotheses, stated in the alternate form:

H3a. Auditors performing tasks high in clarity will understate their judgments of risk assessment.

H3b. Auditors performing tasks low in quantity will understate their judgments of risk assessment.

Skill and task complexity

Cognitive load theory research finds that subjects who are more familiar with a task integrate and process information more effectively because their task-specific knowledge reduces cognitive effort and the negative impact on working memory (Anderson, 1983; Sweller and Chandler, 1991; Rose and Wolfe, 2000). This literature suggests that auditors possessing high levels of task-specific knowledge should integrate and process information to make more accurate risk assessments than auditors possessing low levels of task-specific knowledge (Bonner, 1994).

Conversely, research suggests that auditors lacking expertise or skill often fail to weight cues properly, especially inconsistent ones, when forming evaluations of internal control risk (Abdolmohammadi and Wright, 1987; Bonner, 1994). This lack of ability in deriving an overall assessment from a list of possibly contradictory cues often leads beginning auditors to overestimate internal control risk (Bonner, 1994). From this research, we note that inexperienced auditors most likely overstate their internal control risk assessments and, as task-specific knowledge (i.e. skill) increases, auditors' overstatement of risk assessment decreases. We test for this main effect of skill in the following hypothesis. The dashed line in Figure 2 shows the hypothesis predictions for high- and low-skill auditors:

H4a. High-skill auditors will make risk assessments that are not different from zero, i.e. not over- or understated, whereas low-skill auditors will make risk assessments that are different from zero and overstated.

More importantly, Bonner's (1994) model asserts that skill mediates task complexity effects on judgment performance. Defining "skill" as a person's level of task-specific knowledge, her model proposes that as task complexity increases, the change in judgment performance is smaller for high-skill auditors than for low-skill auditors. In other words, the risk assessments made by a high-skill auditor differ only a small amount between simple and complex tasks, whereas the risk assessments made by a low-skill auditor differ greatly between simple and complex tasks. In Figure 3, the red line shows the change in risk assessment values between simple and complex tasks for high-skill auditors, while the yellow line shows the change for low-skill auditors. The green line shows the change in risk assessment values on average. Figure 3 graphically shows the following hypothesis:

H4b. As task complexity increases, the change in risk assessments for high-skill auditors is less than the change in risk assessments for low-skill auditors.



[Figure 2. H4a – hypothesized and actual results. Risk assessment (understated (−) to overstated (+)) is plotted against skill level, with separate hypothesized and actual lines.]

[Figure 3. H4b – hypothesized and actual results. Risk assessment (understated (−) to overstated (+)) is plotted against task level (complex to simple) for high- and low-skill auditors, with hypothesized and actual lines.]



III. Experimental design

Experimental design and task

We use a mixed factors experimental design to collect the data[3]. This design allows us to examine the effects of task complexity and skill between and within subjects. H1-H3 are directional tests from the central point of zero; therefore, we perform significance tests on the difference of the subjects' mean responses from zero. For H4a and H4b, we perform repeated measures analysis of variance tests to evaluate the effects of task complexity and skill on judgment performance over more than one setting, as is done in practice when auditors plan more than one audit at a time.

The mixed factors design includes two levels (high and low) for each dimension of task complexity (clarity and quantity) on a between-subjects basis, with four cases on a within-subjects basis. This combination yields four between-subjects cells, each with four within-subjects cases, as sketched below. Skill level is a measure of the subject's task-related knowledge. This covariate measure initially equals the subject's case 1 internal control evaluation score.
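As a concrete picture of this layout, the sketch below builds a 2 × 2 between-subjects assignment crossed with four within-subjects cases. The subject identifiers, random seed, and resulting cell sizes are our own illustration and not the study's actual assignment procedure.

```python
import itertools
import random

import pandas as pd

random.seed(0)

# Hypothetical subject pool; identifiers are ours, not the study's.
subjects = [f"S{i:03d}" for i in range(1, 132)]  # 131 usable subjects, as reported

# 2 x 2 between-subjects treatments: clarity x quantity of information.
treatments = list(itertools.product(["high", "low"], ["high", "low"]))

rows = []
for subj in subjects:
    clarity, quantity = random.choice(treatments)  # random assignment to one cell
    for case in (1, 2, 3, 4):                      # four within-subjects cases
        rows.append({"subject": subj, "clarity": clarity,
                     "quantity": quantity, "case": case})

design = pd.DataFrame(rows)
print(design.groupby(["clarity", "quantity"])["subject"].nunique())  # cell sizes
```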









Consistent with prior studies examining internal control evaluations, we select an internal control review of the payroll cycle. This task is frequently assigned to beginning auditors (Ashton and Kramer, 1980) and is an internal control cycle that is generally familiar to our subjects. We recruited subjects at three Midwestern state universities by offering bonus points for participation[4]. Subjects completed all experimental activities under the direct supervision of one of the authors[5]. Subjects were senior-level accounting majors enrolled in the undergraduate auditing course because these students will be the most likely to evaluate internal controls, especially payroll controls. Several of the students also had internships in public accounting firms prior to this course.

Experimental sequence

Upon arrival, students complete the consent form and a demographic questionnaire designed to measure prior experiences as well as course work. They begin the experiment by reading a brief description of the background and circumstances surrounding the payroll environment in general. Next, subjects review a specific payroll case scenario and provide internal control evaluations for five dimensions of the payroll cycle, as well as an overall control evaluation. Subjects complete four payroll cases in the experimental task. The first case represents the student's pre-task knowledge; the remaining three cases represent repeated measures of the dependent variable designed to measure within-subject effects.

We randomly assigned subjects to one of four task complexity treatments such that all cases presented to each subject remain consistent as to level of complexity (i.e. subjects in the simple treatment see four separate cases of simple complexity). While the complexity level remains the same, the cases are not identical. Subjects evaluated the internal control environments described in four sequential cases; each case was evaluated individually, without reference to the other cases. The subjects conclude the experiment by completing a brief post-experimental questionnaire designed to measure attitudes toward the tasks.

For consistency and comparison, the experimental materials are from Eining and Dorr's (1991) expert system study. Each case presents descriptions of internal payroll controls for five dimensions – initial hiring and termination, recording of time worked, calculation and preparation of payroll, payment and distribution of wages, and other controls. The subjects provide quantitative evaluations (from 0 = absence of all controls to 100 = all controls are present and working as designed) for each of the five dimensions as well as an overall evaluation of payroll internal control.

Task complexity

We manipulate task clarity by varying the consistency of internal controls described in the cases. High clarity exists when all cues provide a similar signal regarding the state of the internal control environment, e.g. all strong internal control cues or all weak internal control cues. Conversely, low clarity is a mix of cues such that no clear signal (i.e. strong internal control or weak internal control) can be determined. This manipulation is identical to the one successfully used by Mascha (2001). We manipulate quantity (i.e. amount of information) by varying the number of cues provided to the subject. The high quantity subjects receive 12 cues, whereas the low quantity subjects receive five cues. The cues provide both relevant information (e.g. the personnel department approves all new hires) and irrelevant information (e.g. payday is every Tuesday).

Skill definition

Bonner (1994) stipulates that skill is primarily a function of task-related knowledge, i.e. the knowledge and abilities related to a specific task, not knowledge or abilities in general. Since recent literature supports Bonner's definition (Tan and Kao, 1999; Bierstaker and Brody, 2001; Tan et al., 2002), we adopt this definition as well. We use the difference between the subject's score and the expert system's score on the overall evaluation of internal control over payroll for the first case to measure the subject's skill level (i.e. task-related knowledge). Scores closer to the expert system scores, i.e. closer to zero, are considered "high-skill," whereas scores further from the expert system scores, i.e. further from zero in either direction, are considered "low-skill."

Since the skill measure can be either positive or negative and better scores are closer to zero, a continuous measure for skill is not appropriate[6]. Based on the distribution of the scores for skill, a median split is also not appropriate[7]; therefore, we divide the subjects into high- and low-skill groups based on a 10 percent range around zero, the highest skill score. We categorize subjects whose scores lie between positive 10 and negative 10 (i.e. −10 ≤ X ≤ 10, where X is the subject's difference score) as "high" skill level. We categorize all subjects whose scores are greater than positive 10 or less than negative 10 as "low" skill level. This procedure results in 48 high-skill subjects and 83 low-skill subjects; a minimal sketch of this classification rule appears below.
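The following sketch applies the stated classification rule to hypothetical case 1 evaluations; all values and column names are invented for illustration, not drawn from the study's data.

```python
import pandas as pd

# Hypothetical case 1 evaluations (0-100 scale); all values here are invented.
df = pd.DataFrame({
    "subject": ["S001", "S002", "S003"],
    "subject_eval": [78, 55, 90],  # subject's overall internal control evaluation
    "expert_eval": [70, 70, 70],   # expert system's evaluation of the same case
})

# Skill score = subject's case 1 evaluation minus the expert system's score.
df["skill_score"] = df["subject_eval"] - df["expert_eval"]

# High skill: within 10 points of the expert system, i.e. -10 <= score <= +10.
df["skill_group"] = df["skill_score"].abs().le(10).map({True: "high", False: "low"})
print(df)  # S001 -> high (+8), S002 -> low (-15), S003 -> low (+20)
```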



Dependent variable

The dependent variable is the subject's internal control risk assessment for the payroll cycle (i.e. risk assessment), measured as the difference between the subject's overall evaluation of internal control and the overall evaluation determined by an expert system. This measure is consistent with prior research examining decision accuracy for judgment tasks (Libby and Libby, 1989; Eining and Dorr, 1991; Odom and Dorr, 1995; Mascha, 2001). The range for the risk assessment scores is potentially −100 to +100. A negative score means the subject evaluated the payroll internal controls lower than the expert system score (i.e. understated risk assessment), equivalent to an auditor assessing control risk too low. A positive score means the subject evaluated the payroll internal controls higher than the expert system score (i.e. overstated risk assessment), equivalent to an auditor assessing control risk too high.

IV. Analyses and results

Demographics

A total of 141 students participated in the experiment. We exclude ten subjects from the analyses because of missing data, resulting in a total sample of 131. Thirty-two subjects have prior payroll accounting work experience, and six have prior internal control review experience. Every treatment group contains students with internal control and/or payroll accounting work experience. The final sample comprises 70 males (53 percent) and 61 females (47 percent). χ²-tests indicate no significant differences (at the 0.05 level) between treatment groups for experience and gender.

Table I shows the number of subjects, average age, average GPA overall and by major, average credit hours, and number of subjects with internal control and payroll accounting experience by treatment group. Analysis of variance or χ²-tests show no significant differences (at the 0.05 level) between treatments on these demographic variables, suggesting successful randomization (Kerlinger, 1989). Table I also shows some notable characteristics about this subject group. First, the average age is approximately 25, because two of the institutions have older students returning to school and part-time students. The average overall and major GPAs are quite stable at 3.25, or a B+.

Hypotheses testing: task complexity and its dimensions

For H1-H3, we test the mean risk assessments for significance from zero because the means represent the difference between the subject's score and the expert system score. Scores closest to zero, either positive or negative, are more "accurate," while scores further from zero are overstated judgments when positive and understated judgments when negative.

H1a and H1b predict that task complexity (simple versus complex) significantly affects auditors' risk assessment judgments. A repeated measures analysis of variance test (Panel A, Table II) shows that task complexity significantly affects subjects' risk assessment judgments (using Wilks' Lambda: F = 40.30, p-value < 0.0001).
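The paper reports these repeated measures results as multivariate tests (Wilks' Lambda). As a rough modern stand-in rather than a reproduction of the authors' analysis, the sketch below fits a linear mixed model with a random intercept per subject to simulated data with the same within-/between-subjects structure; all names, effect sizes, and data are our invention.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated long-format data with the study's structure: a between-subjects
# complexity factor and repeated cases per subject. Values are invented.
n_subj = 40
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), 3),
    "case": np.tile([2, 3, 4], n_subj),
    "complex_task": np.repeat(rng.integers(0, 2, n_subj), 3),
})
df["risk"] = 8 * df["complex_task"] + rng.normal(0, 5, len(df))  # overstatement under complexity

# A random intercept per subject accounts for the repeated measurements.
fit = smf.mixedlm("risk ~ complex_task + C(case)", df, groups=df["subject"]).fit()
print(fit.summary())
```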






Table I. Average age, GPA, credit hours and experience by treatment group

Group                                1. Simple(a)  2. Mixed(b)  3. Mixed(c)  4. Complex(d)  Total
No. of subjects                      34            31           34           32             131
Average age (yrs)                    25            25           28           24             25.6
Average overall GPA (4.00 = A)       3.25          3.26         3.25         3.26           3.25
Average major GPA (4.00 = A)         3.27          3.31         3.16         3.29           3.25
Average credit hours                 114           113          120          111            114
Exp. with internal controls? Yes     4             1            4            2              11
                             No      30            30           30           30             120
Exp. with payroll accounting? Yes    10            8            6            7              31
                              No     24            23           28           25             100

Notes: (a) Group 1 – simple consists of low quantity and high clarity of information; (b) group 2 – mixed consists of high quantity and high clarity of information; (c) group 3 – mixed consists of low quantity and low clarity of information; (d) group 4 – complex consists of high quantity and low clarity of information. ANOVA and χ²-tests show no significant differences between treatment groups on the above demographic variables.






Table II. Risk assessment judgment means and significance tests (tests of H1)

Panel A: multivariate analysis of complexity – complex versus simple

Source       Wilks' Lambda  Num df  Den df  Multivariate F  Significance
Case         0.614565       2       62      19.44           0.0001
Complexity   0.434780       2       62      40.30           0.0001

Panel B: Complex tasks: HA = judgments of risk assessment will be significantly overstated; simple tasks: H0 = judgments of risk assessment will not be significantly over- or understated

           Complex (n = 31)       Simple (n = 34)
Cases      Mean      p-value      Mean      p-value
Case 1     20.42*    0.0001       7.21*     0.0102
Case 2     19.29*    0.0001       −1.76     0.5805
Case 3     −14.39*   0.0001       7.26*     0.0152
Case 4     −4.52     0.1139       11.88*    0.0001

Notes: The repeated measures analysis of variance test in Panel A shows that task complexity significantly affects subjects' risk assessment judgments; Panel B shows the results of testing the mean risk assessments for significant differences from zero; *significant difference from zero at the 0.05 alpha level.



Therefore, we test the mean risk assessments for significance from zero to test for over- or understatements of risk assessment.

H1a predicts that auditors significantly overstate their risk assessment judgments when performing complex tasks. Table II, Panel B, shows that H1a is partly supported. As predicted, subjects significantly overstate their risk assessment judgments for payroll internal controls in cases 1 and 2 (p = 0.0001 for both). However, subjects significantly understate their risk assessments for case 3 (p = 0.0001), and for case 4, their risk assessments are not significantly different from zero (p = 0.1139), indicating no significant difference from the expert system risk assessments. In summary, subjects either overstated their risk assessment judgments or made risk assessment judgments in line with the expert system.

H1b presents the null hypothesis that auditors do not significantly over- or understate their risk assessment judgments of internal controls under simple task conditions. Table II, Panel B, shows that for three of the four cases, subjects significantly overstate their risk assessment judgments of payroll internal controls (case 1: p = 0.0102; case 3: p = 0.0152; case 4: p = 0.0001). Only for case 2 did the subjects' risk assessments not differ significantly from the expert system's risk assessments (p = 0.5805). These results suggest that auditors rely on their conservatism training when making risk assessment judgments in a simple task condition. Therefore, we reject the null hypothesis H1b. When presented with simple tasks, auditors overstate their risk assessment judgments of payroll internal controls. The solid red line in Figure 1 shows the results for H1a and H1b.

H2 and H3 examine the individual dimensions of task complexity as defined in Bonner's (1994) model. Repeated measures analysis of variance tests (Tables III and IV, Panel A) show that clarity and quantity significantly affect subjects' risk assessment judgments (using Wilks' Lambda: F = 23.62, p-value < 0.0001 for clarity; F = 16.66, p-value < 0.0001 for quantity). Therefore, we test the mean risk assessments for significance from zero to test for over- or understatements of risk assessment.
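Testing a condition's mean difference score against zero is a one-sample t-test. The sketch below shows this with invented scores; the paper's data are not reproduced here.

```python
import numpy as np
from scipy import stats

# Hypothetical difference scores (subject evaluation minus expert system score)
# for one case in one condition; the numbers are ours, not the paper's data.
scores = np.array([22, 15, 30, 8, 25, 19, 12, 27])

# A significantly positive mean indicates overstatement; negative, understatement.
t_stat, p_value = stats.ttest_1samp(scores, popmean=0)
print(f"mean={scores.mean():.2f}, t={t_stat:.2f}, p={p_value:.4f}")
```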









Table III. Risk assessment judgment means and significance tests (test of H2)

Panel A: multivariate analysis of clarity (high versus low)

Source     Wilks' Lambda  Num df  Den df  Multivariate F  Significance
Case       0.623861       3       126     25.32           0.0001
Clarity    0.640048       3       126     23.62           0.0001

Panel B: H2a and H2b – HA = judgments of risk assessment will be significantly overstated

           Low clarity (n = 65)   High quantity (n = 63)
Cases      Mean      p-value      Mean      p-value
Case 1     21.13*    0.0001       14.93*    0.0001
Case 2     8.19*     0.0008       9.55*     0.0001
Case 3     3.16      0.1868       −3.66     0.1316
Case 4     −4.57*    0.0215       0.32      0.8736

Notes: The repeated measures analysis of variance test in Panel A shows that clarity significantly affects subjects' risk assessment judgments; Panel B shows the results of testing the mean risk assessments for significant differences from zero; univariate test for within-subject effects, low versus high clarity: F = 23.64, p = 0.0001; *significant difference from zero at the 0.05 alpha level.



Table IV. Risk assessment judgment means and significance tests (test of H3)

Panel A: multivariate analysis of quantity (high versus low)

Source     Wilks' Lambda  Num df  Den df  Multivariate F  Significance
Case       0.623861       3       126     25.32           0.0001
Quantity   0.716030       3       126     16.66           0.0001

Panel B: H3a and H3b – H0 = judgments of risk assessment will not be significantly over- or understated

           High clarity (n = 66)  Low quantity (n = 68)
Cases      Mean      p-value      Mean      p-value
Case 1     8.28*     0.0001       14.49*    0.0001
Case 2     −0.74     0.7530       −2.10     0.3675
Case 3     6.77*     0.0048       13.59*    0.0001
Case 4     8.59*     0.0001       3.71†     0.0556

Notes: The repeated measures analysis of variance test in Panel A shows that quantity significantly affects subjects' risk assessment judgments; Panel B shows the results of testing the mean risk assessments for significant differences from zero; univariate test for within-subject effects, low versus high quantity: F = 23.71, p = 0.0001; *significant difference from zero at the 0.05 alpha level; †significant difference from zero at the 0.10 alpha level.



Tables III and IV, Panel B, display the means, significance values, and number of subjects for the clarity and quantity dimensions for the four cases of payroll internal control risk assessments. H2a and H2b predict that subjects will significantly overstate risk assessments for tasks lacking in clarity or for tasks high in quantity of information. As noted in Table III, Panel B, these hypotheses are partly supported: subjects significantly overstate their risk assessment judgments for payroll controls for cases 1 and 2 (p-values ≤ 0.0008) in both the low clarity and high quantity conditions. However, except for the case 4 low clarity condition, subjects' risk assessments for cases 3 and 4 are not significantly different from zero (p-values ≥ 0.1316), and thus not significantly different from the expert system's risk assessments.

Table IV, Panel B, summarizes the results for testing H3a and H3b. These hypotheses state that subjects' risk assessments for tasks high in clarity or low in quantity will be significantly understated. The results in Table IV, Panel B, show that for cases 1, 3, and 4, subjects significantly overstate their risk assessments under the high clarity and low quantity dimensions, rejecting H3a and H3b. For case 2, subjects' risk assessments were not significantly different from the expert system's risk assessments under both the high clarity and low quantity dimensions (p = 0.7530; p = 0.3675). Although the case 2 results support Bonner's (1994) model prediction that high clarity or low quantity of information results in more accurate decisions, the preponderance of evidence for cases 1, 3, and 4 supports the use of conservatism, rather than overconfidence, in rejecting H3a and H3b. Subjects receiving either clear or concise information overstate their risk assessments of payroll internal controls.






Hypotheses testing: task complexity and skill

H4a and H4b suggest that high- and low-skill auditors make significantly different internal control risk assessments. The repeated measures analysis of variance test (Table V, Panel A) finds a significant effect for skill and for the interaction of skill with task complexity (using Wilks' Lambda: F = 4.17, p-value = 0.0176 for skill; F = 3.60, p-value = 0.0019 for the interaction). Therefore, we test the mean risk assessments for significance from zero to test for over- or understatements of risk assessment. Since the first case in the task is used to measure the subject's level of skill, only cases 2-4 are included.



Table V. Risk assessment judgment means and significance tests (tests of H4a)

Panel A: multivariate analysis of complexity and skill

Source              Wilks' Lambda  Num df  Den df  Multivariate F  Significance
Case                0.996619       2       122     0.21            0.8133
Complexity          0.433065       6       244     21.13           0.0001
Skill               0.935951       2       122     4.17            0.0176
Complexity × skill  0.843825       6       244     3.60            0.0019

Panel B: test of skill effects – HA = high-skill subjects will not make significantly over- or understated risk assessments, while low-skill subjects will make significantly over- or understated risk assessments

           High skill (n = 48)    Low skill (n = 83)
Cases      Mean      p-value      Mean      p-value
Case 2     −4.04     0.1785       6.62*     0.0016
Case 3     −2.24     0.4258       7.27*     0.0002
Case 4     1.22      0.6407       3.47†     0.0547

Notes: The repeated measures analysis of variance test shows that skill and the interaction of skill with task complexity significantly affect subjects' risk assessment judgments; Panel B shows the results of testing the mean risk assessments for significant differences from zero; univariate test for within-subject effects, low versus high skill: F = 3.27, p = 0.0395; *significant difference from zero at the 0.05 alpha level; †significant difference from zero at the 0.10 alpha level.









Table V, Panel B, summarizes the effects of skill on risk assessment judgments. For all three cases (2-4), the high-skill subjects' risk assessments are not significantly different from zero (p-values ≥ 0.1785), i.e. they are closer to the expert system risk assessments; thus, their risk assessments are not significantly over- or understated. However, the low-skill subjects' risk assessments are significantly different from zero (p-values ≤ 0.0547), in the overstatement direction. The results support H4a: high-skill subjects make more accurate risk assessments (not significantly different from the expert system's risk assessments), while low-skill subjects make more overstated risk assessments. The solid line in Figure 2 shows these results for H4a.

H4b summarizes skill's interactive effect with task complexity as predicted in Bonner's (1994) model. As task complexity increases, the change in risk assessments from simple to complex tasks is greater for low-skill auditors than for high-skill auditors. Table VI summarizes the results for this hypothesis. To test the interactive hypothesis, we first calculated the difference between the simple task and complex task means for high- and low-skill subjects for each case. We then performed t-tests of these mean differences for each case. For all three cases, the change in risk assessments from simple to complex tasks is significantly different between high- and low-skill subjects (case 2: t = 5.27, p = 0.0001; case 3: t = 4.59, p = 0.0001; case 4: t = 3.55, p = 0.0007). The change in risk assessments for high-skill subjects is significantly less than the change for low-skill subjects[8]. This result supports H4b. Figure 3 shows these results graphically.
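The paper tests H4b by t-testing the simple-versus-complex mean differences across skill groups; an equivalent framing, sketched below on simulated data, is the complexity × skill interaction term in a linear model. Group sizes, effect magnitudes, and variable names are our assumptions, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Simulated scores for one case; all magnitudes are invented for illustration.
n = 120
df = pd.DataFrame({
    "complex_task": rng.integers(0, 2, n),  # 0 = simple, 1 = complex
    "high_skill": rng.integers(0, 2, n),    # 0 = low skill, 1 = high skill
})
# Low-skill subjects swing more between simple and complex than high-skill ones.
swing = np.where(df["high_skill"] == 1, 4, 20)
df["risk"] = swing * df["complex_task"] + rng.normal(0, 6, n)

fit = smf.ols("risk ~ complex_task * high_skill", df).fit()
print(fit.params)  # the complex_task:high_skill term captures the H4b contrast
print(fit.pvalues["complex_task:high_skill"])
```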



Table VI. Risk assessment judgment means and significance tests (tests of H4b)

                          Case 2                        Case 3                        Case 4
Task complexity      Simple  Complex  Change      Simple  Complex  Change      Simple  Complex  Change
High skill           −1.95   −1.60    0.35        6.70    −30.00   −36.70      10.30   4.00     −6.30
Low skill            −1.50   23.31    24.81       8.07    −11.38   −19.45      14.14   −6.15    −20.29
t-test of mean
differences          t = 5.27, p = 0.0001         t = 4.59, p = 0.0001         t = 3.55, p = 0.0007

Notes: Test of the interaction of skill and task complexity – HA = as task complexity increases, the change in risk assessments for high-skill auditors is significantly less than the change in risk assessments for low-skill auditors; Change = complex mean minus simple mean.

Additional analyses for within-subject effects

Our experimental design allows examination of how task complexity affects risk assessment judgments over time. Bonner's (1994) model and cognitive load research do not specifically address the effects of repetitive judgments, so no hypotheses were developed. The results in Tables II-IV show that subjects initially significantly overstate their risk assessments of payroll internal controls under all task complexity conditions (case 1 p-values ≤ 0.0102, in the positive direction). As the subjects complete the task three more times (cases 2-4), subjects in the complex conditions (complex, low clarity, and high quantity) improve their risk assessments, i.e. their assessments no longer differ significantly from the expert system's risk assessments[9]. However, subjects in the simple conditions (simple, high clarity, low quantity) continue to significantly overstate their risk assessments, except for case 2, where the subjects' risk assessments are not significantly different from the expert system's risk assessments.



We also examine the interactive effects of task complexity and auditor skill levels on risk assessment judgments made repetitively. Table VII shows that for simple tasks, both high- and low-skill subjects begin with risk assessments not significantly different from zero, i.e. not significantly different from the expert system's risk assessments. As they continue to make risk assessments under the simple task, both high- and low-skill subjects make overstated risk assessments, i.e. significantly different from zero in a positive direction. The results for complex tasks show that high- and low-skill subjects perform differently. Although no particular pattern exists in these results, two of the three cases show that high-skill subjects make risk assessments similar to the expert system's risk assessments, i.e. not significantly different from zero[10], while the low-skill subjects make risk assessments significantly different from zero, both over- and understating control risk. On a positive note, although they reverse from over- to understatement, the low-skill subjects improve their risk assessments over repetitive assessments, moving closer to the expert system's risk assessments.






Table VII. Risk assessment judgment means and significance tests – additional analyses

                        High skill              Low skill
                        Mean      p-value       Mean      p-value
Complex tasks           (n = 5)                 (n = 26)
  Case 2                −1.60     0.8425        23.31*    0.0001
  Case 3                −30.00*   0.0001        −11.38*   0.0008
  Case 4                4.00      0.5700        −6.15*    0.0479
Simple tasks            (n = 20)                (n = 14)
  Case 2                −1.95     0.6284        −1.50     0.7553
  Case 3                6.70†     0.0782        8.07†     0.0759
  Case 4                10.30*    0.0040        14.14*    0.0010

Notes: Summary of means by task complexity and skill level; repeated measures analysis of variance univariate test for within-subject effects, interaction of skill and task complexity: F = 3.15, p = 0.0054; *significant difference from zero at the 0.05 alpha level; †significant difference from zero at the 0.10 alpha level.

Additional analyses for mixed dimensions of task complexity

The main analyses examined the two manipulated conditions for high (low clarity and high quantity) and low (high clarity and low quantity) task complexity. These additional analyses examine the effects of the other two manipulated conditions: low clarity, low quantity (low/low) and high clarity, high quantity (high/high). The repeated measures analysis of variance test (Table VIII, Panel A) finds a significant effect for complexity (using Wilks' Lambda: F = 23.54, p-value < 0.0001). Therefore, we test the mean risk assessments for significance from zero to test for over- or understatements of risk assessment.

Table VIII, Panel B, presents the results of testing these two conditions. For cases 1 and 3, subjects significantly overstate their risk assessments, while for case 2 subjects make risk assessments that do not differ significantly from the expert system's risk assessments. Case 4 results are only marginally significant, such that subjects' risk assessments are closer to the expert system's risk assessments than to overstatement (p = 0.1013, low/low; p = 0.0646, high/high).









Table VIII. Risk assessment judgment means and significance tests – additional analyses

Panel A: multivariate analysis of complexity – all four conditions

Source       Wilks' Lambda  Num df  Den df  Multivariate F  Significance
Case         0.620940       3       125     25.44           0.0001
Complexity   0.276439       9       304     23.54           0.0001

Panel B: low/low and high/high – no theoretical predictions for these tasks

           Low/low (n = 34)       High/high (n = 32)
Cases      Mean      p-value      Mean      p-value
Case 1     21.76*    0.0001       9.41*     0.0012
Case 2     −2.44     0.4448       −0.03     0.9939
Case 3     19.91*    0.0001       6.78*     0.0276
Case 4     −4.47     0.1013       5.21†     0.0646

Notes: The repeated measures analysis of variance test in Panel A shows that task complexity significantly affects subjects' risk assessment judgments; Panel B shows the results of testing the mean risk assessments for significant differences from zero; *significant difference from zero at the 0.05 alpha level; †significant difference from zero at the 0.10 alpha level.



Except for case 4, the results in Table VIII, Panel B, are very similar to those for the simple, high clarity, and low quantity conditions, indicating the low/low and high/high conditions are closer to a simple task than a complex task on a possible task-complexity continuum. Observing the means in Table VIII, Panel B, we find the low clarity, low quantity condition results in larger overstated risk assessments for cases 1 and 3, i.e. those further from zero and the expert system's risk assessments, than the high clarity, high quantity condition. These larger overstatement values mirror the complex task results, while the lower overstatement values mirror the simple task results. Although both the low/low and high/high conditions are similar to the simple task condition, the high/high condition is more similar than the low/low condition. These comparisons suggest that clarity of information dominates quantity of information, because the clarity element determines whether the high/high or low/low condition is simpler or more complex. However, the results do not reflect a strong pattern for a clear task complexity continuum. Additional research is required to make any definitive conclusions[11].

V. Discussion and conclusions

Using a within- and between-subjects experimental design, we investigated the effects of skill and task complexity on risk assessment judgments over time. We use Bonner's (1994) model and cognitive load research in developing the hypotheses. The experimental design allows exploratory research into the mixed dimensions of task complexity (i.e. high clarity, high quantity and low clarity, low quantity) not generally examined in prior research.

We first examine the effects of task complexity without regard to skill level. The results primarily show that for each condition – high clarity, low clarity, high quantity, and low quantity – subjects, on average, significantly overstate their risk assessment judgments for payroll internal controls. This finding holds when we examine the combinations of these dimensions. For simple (high clarity, low quantity), complex (low clarity, high quantity), and mixed dimensions (high clarity, high quantity and low clarity, low quantity), subjects significantly overstate the risk assessment judgments for



payroll internal controls. These findings suggest the use of conservatism in assessing control risk, preferring to assess control risk too high and resulting in more auditing than may be necessary.

We also examine the effects on internal control risk assessment judgments for tasks designed as high clarity, high quantity and low clarity, low quantity. Although both conditions are more similar to simple tasks, the low/low condition mirrors the complex task condition more than the high/high condition does. This result indicates that clarity of information dominates quantity of information in defining task complexity. More research is needed in this area to definitively construct a task complexity continuum with the task complexity dimensions.

When skill is considered, the results show that high-skill subjects make risk assessment judgments for payroll internal controls similar to the expert system, while low-skill subjects overstate their risk assessment judgments. When skill level and task complexity interact, we find the mediating effect proposed in Bonner's (1994) model: the change in risk assessments from simple to complex tasks is less negative for high-skill subjects than for low-skill subjects.

A unique feature of our study is that the subjects perform the task several times, providing a look at changes in performance over a short period of time. Subjects determine risk assessment judgments for payroll internal controls for four cases within their assigned experimental condition. The results show that for low clarity, high quantity, or complex tasks, subjects' risk assessment judgments improve over time, moving towards the expert system risk assessments. However, for high clarity, low quantity, or simple tasks, subjects' risk assessments remain, or become even more, significantly overstated over time. These results are contrary to the premise that risk assessment judgments are worse for complex tasks than for simple tasks based on limited memory capacity, as used in Bonner's model and cognitive load research. These results support an argument that, as an auditor acquires experience with a task, the auditor improves his/her performance on that task. Our study did not provide feedback to the subjects between performances of the repeated tasks, a method that might improve judgments for repetitive simple tasks. Future research needs to examine whether providing feedback between task performances improves auditor judgments for repetitive simple tasks.

These findings extend to both high- and low-skill subjects. For complex tasks, high-skill subjects' risk assessments were mostly not significantly different from the expert system's risk assessments, and the low-skill subjects' risk assessments improved over time, moving closer towards the expert system's risk assessments. For both skill groups, performing complex tasks over a short period of time appears to improve risk assessment judgments. However, for simple tasks, both skill groups progressively overstate their risk assessment judgments, consistent with prior research studying the effects of skill levels. Novice subjects (i.e. low-skill subjects) frequently overestimate control risk assessments with over-confidence (Hornick and Ruff, 1997). High-skill subjects often perform worse when presented with simple tasks (Arnold and Sutton, 1998; Mascha, 2001; Mascha and Smedley, 2007) or, at best, do not improve further (Mascha and Smedley, 2007). These more experienced subjects may find simple tasks boring or beneath their skill level; therefore, they do not attend to or do not process all of the information, resulting in poor judgments (Mascha, 2001; Miller et al., 2006).












The results suggest two implications for audit practice. First, risk assessment judgments vary between skill levels and complexity levels, even over a short period of time. Our findings generally support current practice, where more experienced (presumably high-skill) auditors perform more complex risk assessments. Fortunately, we find that both experienced and novice auditors performing complex risk assessments improve their risk assessment judgments as they make more and more of these assessments. Unfortunately, both experienced and novice auditors performing simple risk assessments make more and more overstated risk assessments, i.e. assessing control risk too high. Performing risk assessments where the information is clear and concise (i.e. a simple task) may cause high- and low-skill auditors to become overconfident and not pay close attention to the available information, thus assessing control risk too high and performing too much auditing.

Second, conservatism clearly plays a dominant role in auditors' risk assessment judgments. Under almost all of the conditions where the subjects' risk assessment judgments did not approximate the expert system risk assessment, the subjects overstated their risk assessment judgments. For complex situations, this error is probably best, but for simple situations, this error can increase the cost of the audit considerably and unnecessarily.

For researchers incorporating task complexity as part of an experimental design, both clarity and quantity of information are important in the design; specifically, clarity is more important than quantity for measuring task complexity. In addition, the skill levels of the subjects significantly affect the comparison between simple and complex tasks.

This study is not without potential limitations, and future research could address these concerns. First, the artificial classroom setting in which the task was performed raises questions of subject attention to the task. The researchers did not observe any severe lack of attention to the task or note any experimental demand effects. The subjects also performed the task several times, so they all were familiar with the task requirements. Second, the subjects are students, not practicing auditors. While students are adequate surrogates for beginning auditors (Ashton and Kramer, 1980, the seminal study) and for testing theory, the results may not generalize to auditors as a whole. We also had a small sample size for our high-skill subjects performing a complex task; however, the results are as expected. Finally, we used only one task. Future research could use a variety of tasks performed by practicing auditors, such as audit area judgments (e.g. adequacy of the allowance for doubtful accounts, warranty expense/payable, etc.) and audit opinion determination (e.g. going concern, explanation of a significant matter, etc.).

Notes

1. Because Congress intended the inspection process to be public, the PCAOB publishes its inspections and findings on its web site (www.pcaob.org/Inspections/).

2. Our paper focuses on task complexity and the mediating effects of skill. We leave the mediating effects of motivation for future research.

3. We designed the experiment to examine several issues concerning task complexity, some of which are not included in this paper but may be included in future papers.

4. Bonus points are a common reward given to students for participation in research studies. Our students received a minimal number of bonus points, representing less than 2-3 percent



For researchers incorporating task complexity as part of an experimental design, both clarity and quantity of information are important; specifically, clarity is more important than quantity for manipulating task complexity. In addition, the skill levels of the subjects significantly affect the comparison between simple and complex tasks.

This study is not without potential limitations, and future research could address these concerns. First, the artificial classroom setting in which the task was performed raises questions of subject attention to the task. The researchers did not observe any severe lack of attention to the task or note any experimental demand effects, and the subjects performed the task several times, so all were familiar with the task requirements. Second, the subjects are students, not practicing auditors. While students are adequate surrogates for beginning auditors (the seminal study is Ashton and Kramer, 1980) and for testing theory, the results may not generalize to auditors as a whole. We also had a small sample size for our high-skill subjects performing a complex task; however, the results are as expected. Finally, we used only one task. Future research could use a variety of tasks performed by practicing auditors, such as audit-area judgments (e.g. adequacy of the allowance for doubtful accounts, warranty expense/payable) and audit opinion determination (e.g. going concern, explanation of a significant matter).

Notes
1. Because Congress intended the inspection process to be public, the PCAOB publishes its inspections and findings on its web site (www.pcaob.org/Inspections/).
2. Our paper focuses on task complexity and the mediating effects of skill. We leave the mediating effects of motivation for future research.
3. We designed the experiment to examine several issues concerning task complexity, some of which are not included in this paper but may be included in future papers.
4. Bonus points are a common reward given to students for participation in research studies. Our students received a minimal number of bonus points, representing less than 2-3 percent of the total points for the audit course. Instructors noted the bonus points provided sufficient motivation for students to attend to the tasks.
5. Although one of the authors was present while the subjects performed the experiment, the subjects were not aware of the specific hypotheses tested or the manipulated variables. The subjects were told only that they were participating in a study of how auditors make internal control risk assessments. Thus, we do not believe demand effects occurred because of the author's presence during the experiment.
6. Since both ends of the continuum, positive and negative scores, represent poor judgment while scores near zero represent good judgment, a continuous measure of skill is not interpretable for determining over- or understatement of control risk as stated in the hypotheses. Using the absolute value of the scores also does not provide the interpretation necessary for testing the hypotheses.
7. The median score is 15, not zero. Since zero is the most correct score, we do not use a median split for identifying high- and low-skill subject groups.
8. We acknowledge a potential limitation on these findings because of the small number of subjects in the high-skill group (five subjects).
9. For the complex and high-quantity conditions, the improvement continues through case 4. For low clarity, the improvement occurs through case 3, but in case 4 the risk assessments are again significantly different from zero, in this case significantly understated.
10. We acknowledge a potential limitation on these findings because of the small number of subjects in the high-skill group (five subjects).
11. We recognize that the relative effects of clarity and quantity may be more task specific than the task we use in this study, potentially resulting in different relative effects. We acknowledge this issue as a potential limitation of the study.

References
Abdolmohammadi, M. and Wright, A. (1987), "An examination of the effects of experience and task complexity on audit judgments", The Accounting Review, Vol. 62 No. 1, pp. 1-13.
Anderson, J.R. (1983), "A spreading activation theory of memory", Journal of Verbal Learning and Verbal Behavior, Vol. 22, pp. 261-95.
Arnold, V. and Sutton, S.G. (1998), "The theory of technology dominance", Advances in Accounting Behavioral Research, Vol. 1, pp. 175-94.
Ashton, R. and Kramer, S. (1980), "Students as surrogates in behavioral accounting research: some evidence", Journal of Accounting Research, Spring, pp. 1-15.
Basu, S. (1997), "The conservatism principle and the asymmetric timeliness of earnings", Journal of Accounting and Economics, Vol. 24 No. 1, pp. 3-37.
Bierstaker, J.L. and Brody, R. (2001), "Presentation format, relevant domain experience and task performance", Managerial Auditing Journal, Vol. 16, pp. 124-8.
Bonner, S. (1994), "A model of the effects of audit task complexity", Accounting, Organizations and Society, Vol. 19 No. 3, pp. 213-34.
Bonner, S. and Lewis, B.L. (1990), "Determinants of auditor expertise", Journal of Accounting Research, Vol. 28 No. 3, pp. 1-20.
Chaney, P.K. (2002), "Shredded reputation: the cost of audit failure", Journal of Accounting Research, Vol. 40 No. 4, pp. 1221-45.
Eining, M. and Dorr, P. (1991), "The impact of expert system usage on experiential learning in an auditing setting", Journal of Information Systems, Spring, pp. 1-16.






Hornick, S. and Ruff, B. (1997), "Expert system usage and knowledge acquisition: an empirical assessment of analogical reasoning in the evaluation of internal controls", Journal of Information Systems, Vol. 11 No. 2, pp. 57-74.
Iselin, E. (1988), "The effects of information load and information diversity on decision quality in a structured decision task", Accounting, Organizations and Society, Vol. 13 No. 2, pp. 146-64.






Kerlinger, F. (1989), Foundations of Behavioral Research, 3rd ed., Harcourt Brace Jovanovich, Fort Worth, TX.
Libby, R. and Libby, P. (1989), "Expert measurement and mechanical combination in control reliance decisions", The Accounting Review, Vol. 64 No. 4, pp. 729-47.
Mascha, M.F. (2001), "The effect of task complexity and expert system type on the acquisition of procedural knowledge: some new evidence", International Journal of Accounting Information Systems, No. 2, pp. 103-24.
Mascha, M.F. and Smedley, G. (2007), "Can computerized decision aids do 'damage'? A case for tailoring feedback and task complexity based on task experience", International Journal of Accounting Information Systems, Vol. 8 No. 2, pp. 73-91.
Miller, C.L., Fedor, D. and Ramsay, R.J. (2006), "Effects of discussion of audit reviews on auditors' motivation and performance", Behavioral Research in Accounting, Vol. 18, pp. 135-46.
Odom, M. and Dorr, P. (1995), "Epistemological issues of expert system use: an experimental investigation of knowledge acquisition and retention", Journal of Information Systems, Spring, pp. 1-17.
PCAOB (2007a), Observations on Auditors' Implementation of PCAOB Standards Relating to Auditors' Responsibilities with Respect to Fraud, Inspections Release No. 2007-001, January 22, available at: www.pcaob.org/Inspections/Other/2007/01-22_Release_2007-001.pdf (accessed April 14, 2009).
PCAOB (2007b), Auditing Standard No. 5: An Audit of Internal Control Over Financial Reporting that is Integrated with an Audit of Financial Statements, July 25, available at: www.pcaob.org/Standards/Standards_and_Related_Rules/Auditing_Standard_No.5.aspx (accessed July 15, 2009).
Ricchiute, D.N. (2006), Auditing, 8th ed., South-Western, New York, NY.
Rose, J.M. and Wolfe, C.J. (2000), "The effects of system design alternatives on the acquisition of tax knowledge from a computerized tax decision aid", Accounting, Organizations and Society, Vol. 25 No. 3, pp. 285-306.
Sweller, J. and Chandler, P. (1991), "Evidence for cognitive load theory", Cognition and Instruction, Vol. 8 No. 4, pp. 351-62.
Tan, H.T. and Kao, A. (1999), "Accountability effects on auditors' performance: the influence of knowledge, problem-solving ability, and task complexity", Journal of Accounting Research, Vol. 37, pp. 209-23.
Tan, H.T., Ng, T.B. and Mak, B.W.Y. (2002), "The effects of task complexity on auditors' performance: the impact of accountability and knowledge", Auditing: A Journal of Practice & Theory, Vol. 21, pp. 81-96.
Watts, R.L. (2003a), "Conservatism in accounting Part I: explanations and implications", Accounting Horizons, Vol. 17 No. 3, pp. 207-21.
Watts, R.L. (2003b), "Conservatism in accounting Part II: evidence and research opportunities", Accounting Horizons, Vol. 17, pp. 287-301.



Wright, A. and Wright, S. (1996), "The relationship between assessments of internal control strength and error occurrence, impact and cause", Accounting & Business Research, Vol. 27, pp. 58-71.

About the authors
Maureen Francis Mascha is an Assistant Professor of Accounting at Marquette University, where she teaches accounting information systems, fraud examination, and accounting theory. Her research interests include the effect of expert systems and decision aids on judgment. Maureen Francis Mascha is the corresponding author and can be contacted at: maureen.mascha@mu.edu
Cathleen L. Miller is an Associate Professor of Accounting at Wayne State University, where she teaches auditing and financial accounting. Her research interests include behavioral and experimental research in auditing and financial accounting.





